Name-TAPASI PATI
Roll No-0401101238
042023 1
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Contents
- Cyber Attacks - Intrusions
- Introduction
- Why We Need Intrusion Detection
- Models of Intrusion Detection: Anomaly Detection, Misuse Detection
- How Genetic Algorithm is used in IDS
- Conclusion
- References
- System Goals and Preliminary Architecture
- Intrusion Detection
- Firewall

Introduction
The widespread use of computer networks in today's society, especially the sudden surge in importance of e-commerce to the world economy, has made computer network security an international priority. Since it is not technically feasible to build a system with no vulnerabilities, intrusion detection has become an important area of research.
An intelligent intrusion detection system (IIDS) has been developed to demonstrate the effectiveness of data mining techniques that utilize fuzzy logic and genetic algorithms.
Cyber Attack - Intrusion
Cyber attacks (intrusions) are actions that attempt to bypass the security mechanisms of computer systems. They are caused by:
- Attackers accessing the system from the Internet
- Insider attackers: authorized users attempting to gain and misuse non-authorized privileges
Typical intrusion scenario (figure)

Intrusion Detection
Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to bypass the security mechanisms of a computer or network ("compromise the confidentiality, integrity or availability of information resources").
An Intrusion Detection System (IDS) is a combination of software and hardware that attempts to perform intrusion detection and to raise the alarm when a possible intrusion happens.
Security mechanisms always have inevitable vulnerabilities.
Need of Intrusion Detection
- Current firewalls are not sufficient to ensure security in computer networks
- "Security holes" caused by allowances made to users, programmers and administrators
- Insider attacks
- Multiple levels of data confidentiality in commercial and government organizations need multi-layer protection in firewalls

System Goals
The long-term goal is to design and build an intelligent intrusion detection system that is:
- Distributed
- Real-time
- Accurate (low false negative and false positive rates)
- Flexible
- Adaptive in new environments
- Modular with both misuse and anomaly detection components
- Not easily fooled by small variations in intrusion patterns

Architecture (figure)
Data Mining for Intrusion Detection
Misuse detection
- Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
- These models can be more sophisticated and precise than manually created signatures
- Unable to detect attacks whose instances have not yet been observed
Anomaly detection
- Build models of "normal" behavior and detect anomalies as deviations from it
- Possible high false alarm rate: previously unseen (yet legitimate) system behaviors may be recognized as anomalies

Anomaly Detection via Fuzzy Data Mining
Fuzzy logic
- Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, it allows representation of overlapping categories)
- Partial membership in sets or categories
Data Mining
- Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection.

Fuzzy Logic Method (figure)
ID using Fuzzy Logic
Suppose one wants to write a rule such as:
If the number of different destination ports during the last 2 seconds was high, then an unusual situation exists.
Using fuzzy logic, a rule like the one shown above could be written as:
If DP = high, then an unusual situation exists.
DP is a fuzzy variable and high is a fuzzy set. The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
ID using Data Mining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection:
- Association Rules
- Frequency Episodes
- Fuzzy Association Rules
- Fuzzy Frequency Episodes
Association Rules
Association rules were developed to find correlations in transactions using retail data.
For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.
The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
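As an added example (not code from this presentation or from the paper it summarizes), the following Python implements the Apriori level-wise search and derives rules with their support and confidence. The eight baskets are constructed so that the soft drink / potato chips rule above comes out with s = 0.25 and c = 0.50; the names TRANSACTIONS, frequent_itemsets and association_rules are chosen here.

```python
"""Apriori association rules on constructed retail baskets (added example)."""
from itertools import combinations

# Constructed sample: 8 baskets. 4 contain a soft drink, 2 of those also contain
# potato chips, so support(soft drink -> potato chips) = 2/8 and confidence = 2/4.
TRANSACTIONS = [
    {"soft drink", "potato chips"}, {"soft drink", "potato chips"},
    {"soft drink"}, {"soft drink"},
    {"potato chips", "bread"}, {"bread", "milk"}, {"milk"}, {"bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, minsupport):
    """Level-wise Apriori search. A (k+1)-itemset is generated only from frequent
    k-itemsets and kept only if all of its k-subsets are frequent (downward closure)."""
    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items if support({i}, transactions) >= minsupport}
    frequent = {s: support(s, transactions) for s in level}
    k = 1
    while level:
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in level for sub in combinations(union, k)
            ):
                candidates.add(union)
        level = {c for c in candidates if support(c, transactions) >= minsupport}
        frequent.update({c: support(c, transactions) for c in level})
        k += 1
    return frequent


def association_rules(transactions, minsupport, minconfidence):
    """Rules X -> Y (X and Y disjoint, X ∪ Y frequent) whose confidence
    support(X ∪ Y) / support(X) reaches minconfidence. Returns {(X, Y): (s, c)}."""
    frequent = frequent_itemsets(transactions, minsupport)
    rules = {}
    for itemset, s in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                c = s / frequent[antecedent]   # every subset of a frequent set is frequent
                if c >= minconfidence:
                    rules[(antecedent, itemset - antecedent)] = (s, c)
    return rules


if __name__ == "__main__":
    for (x, y), (s, c) in association_rules(TRANSACTIONS, 0.2, 0.4).items():
        print(f"{sorted(x)} -> {sorted(y)}  s = {s:.2f}  c = {c:.2f}")
```

With minsupport = 0.2 only the pair {soft drink, potato chips} survives the second level, so the two rules printed are the one from the text and its reverse (potato chips → soft drink, which has the same support but confidence 2/3).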
Fuzzy Association Rules
Partitioning a quantitative attribute into crisp categories gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.
The fuzzy association rule method of Kuok, Fu and Wong (1998, see References) allows a value to contribute to the support of more than one fuzzy set.
For anomaly detection, we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed.
An example of a fuzzy association rule from one set of audit data is:
SN=LOW, FN=LOW → RN=LOW, c = 0.924, s = 0.49
where SN is the number of SYN flags, FN is the number of FIN flags and RN is the number of RST flags in a 2-second period.

The figure (not reproduced in this extract) shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions.
Figure: Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence = 0.6, minsupport = 0.1). Training Data Set: reference (representing normal behavior). Test Data Sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions) and network3 (including simulated port scanning intrusions).
Frequency Episodes
An algorithm for discovering simple serial frequency episodes from event sequences, based on minimal occurrences, is used; it is later extended to mine fuzzy frequency episodes.
An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.
Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.
So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent if frequency(P)/n ≥ minfrequency.
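The following Python is an added example, not code from the presentation or its source paper: it finds the minimal occurrences of a serial episode in a constructed event sequence, applies the window threshold, and evaluates the frequency condition. It reads n as the number of events in the sequence, which the slides do not define, so that is an assumption; SEQUENCE, EPISODE and the function names are chosen here.

```python
"""Minimal occurrences and frequency of a serial episode (added example)."""

# Constructed event sequence: (timestamp in seconds, event). Each event is the
# fuzzy label of the number of distinct destination ports in a 2-second window.
SEQUENCE = [
    (0, "PN=LOW"), (1, "PN=LOW"), (2, "PN=MEDIUM"), (3, "PN=MEDIUM"),
    (5, "PN=LOW"), (6, "PN=HIGH"), (7, "PN=MEDIUM"), (9, "PN=MEDIUM"),
    (12, "PN=LOW"), (14, "PN=MEDIUM"),
]
EPISODE = ("PN=LOW", "PN=MEDIUM", "PN=MEDIUM")   # E1 -> E2 -> E3, in that order


def minimal_occurrences(sequence, episode, window):
    """Intervals [t, t'] of minimal occurrences of the serial episode e1..ek: the
    events appear in that order inside the interval, t' - t < window, and no
    proper subinterval also contains the episode."""
    events = sorted(sequence)
    found = []
    for i, (t_start, ev) in enumerate(events):
        if ev != episode[0]:
            continue
        # From this start, match the remaining episode events as early as
        # possible; that yields the shortest interval beginning at t_start.
        j, t_end = 1, t_start
        for t, later in events[i + 1:]:
            if j == len(episode):
                break
            if later == episode[j]:
                t_end, j = t, j + 1
        if j == len(episode) and t_end - t_start < window:
            found.append((t_start, t_end))
    # Keep only intervals that do not strictly contain another occurrence.
    return [o for o in found
            if not any(p != o and o[0] <= p[0] and p[1] <= o[1] for p in found)]


def frequency(sequence, episode, window):
    """Number of minimal occurrences shorter than the window threshold."""
    return len(minimal_occurrences(sequence, episode, window))


def is_frequent(sequence, episode, window, minfrequency):
    """frequency(P) / n >= minfrequency, with n taken as the number of events in
    the sequence (assumption: the slides do not define n)."""
    return frequency(sequence, episode, window) / len(sequence) >= minfrequency


if __name__ == "__main__":
    print(minimal_occurrences(SEQUENCE, EPISODE, window=10))
    print(is_frequent(SEQUENCE, EPISODE, window=10, minfrequency=0.2))
```

On the constructed sequence the interval [0, 3] is discarded because [1, 3] inside it already contains LOW, MEDIUM, MEDIUM, while [5, 9] counts with a 10-second window but not with a 4-second one, since its length is not smaller than the window.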
Fuzzy Frequency Episodes
Fuzzy frequency episodes involve quantitative attributes in an event.
An example of a fuzzy frequency episode is given below:
E1: PN=LOW, E2: PN=MEDIUM → E3: PN=MEDIUM, c = 0.854, s = 0.108, w = 10 seconds
where E1, E2 and E3 are events that occur in that order and PN is the number of distinct destination ports within a 2-second period.
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.
This is the integration of fuzzy logic with frequency episodes.
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3, THEN the behavior is suspicious.
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should result.
Genetic Algorithms
Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide "good" definitions for the membership functions of the fuzzy variables.
The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.
Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set.
Figure: the evolution of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual and the average fitness of the whole population.
Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.
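The following Python is an added example, not the IIDS code or the paper's implementation. It ties together the fuzzy rule activation described earlier (a membership degree rather than a crisp test on a count), fuzzy association rules in which a single count feeds the support of several fuzzy sets, the similarity of a mined rule set to the reference set, and a GA whose chromosome holds two parameters per membership function and whose fitness is the similarity difference described above. The shapes of the membership functions, the product t-norm, the similarity formula, the one-item antecedents and the traffic windows are all constructed assumptions, and the intrusion windows are only a stand-in for the simulated intrusions on the slides.

```python
"""Fuzzy association rules for anomaly detection, plus a GA that tunes the fuzzy
membership functions (added example on constructed data)."""
import random
from itertools import permutations

ATTRS = ("SN", "FN", "RN")             # SYN, FIN and RST counts per 2-second window
LABELS = ("LOW", "MEDIUM", "HIGH")

def membership(label, x, a, w):
    """Two-parameter membership functions (assumed shapes): LOW falls from 1 to 0 over
    [a, a+w], HIGH rises from 0 to 1 over [a, a+w], MEDIUM is a triangle at a, half-width w."""
    if label == "LOW":
        return 1.0 if x <= a else 0.0 if x >= a + w else (a + w - x) / w
    if label == "HIGH":
        return 0.0 if x <= a else 1.0 if x >= a + w else (x - a) / w
    return max(0.0, 1.0 - abs(x - a) / w)

def mine_rules(records, chrom, minsupport=0.1, minconfidence=0.6):
    """Rules X=lx -> Y=ly. Fuzzy support of an itemset is the mean over records of the
    product of its item degrees (so one count contributes to several fuzzy sets at
    once); confidence is s(X,Y) / s(X). Returns {rule: (c, s)}."""
    degs = [{(attr, lab): membership(lab, rec[attr], *chrom[(attr, lab)])
             for attr in ATTRS for lab in LABELS} for rec in records]
    n, rules = len(records), {}
    for x, y in permutations(ATTRS, 2):
        for lx in LABELS:
            s_x = sum(d[(x, lx)] for d in degs) / n
            if s_x < minsupport:
                continue
            for ly in LABELS:
                s_xy = sum(d[(x, lx)] * d[(y, ly)] for d in degs) / n
                if s_xy >= minsupport and s_xy / s_x >= minconfidence:
                    rules[((x, lx), (y, ly))] = (s_xy / s_x, s_xy)
    return rules

def similarity(reference, new):
    """Similarity of a rule set to the reference set (assumed form): a reference rule
    also present in the new set scores 1 minus the larger relative drift of its
    confidence or support; a missing rule scores 0. Identical sets score 1."""
    if not reference:
        return 0.0
    total = 0.0
    for rule, (c1, s1) in reference.items():
        if rule in new:
            c2, s2 = new[rule]
            total += max(0.0, 1.0 - max(abs(c1 - c2) / c1, abs(s1 - s2) / s1))
    return total / len(reference)

def fitness(chrom, reference, normal, intrusion):
    """Similarity of normal data to the reference rules minus similarity of intrusion
    data to them; all three sets are mined with the chromosome's membership functions."""
    ref_rules = mine_rules(reference, chrom)
    return (similarity(ref_rules, mine_rules(normal, chrom))
            - similarity(ref_rules, mine_rules(intrusion, chrom)))

def random_chromosome(rng, max_count=40.0):
    """Chromosome: one (a, w) pair per membership function."""
    return {(attr, lab): (rng.uniform(0.0, max_count), rng.uniform(1.0, max_count / 2))
            for attr in ATTRS for lab in LABELS}

def evolve(reference, normal, intrusion, generations=25, pop_size=16, seed=7):
    """Elitist GA with truncation selection, uniform crossover per membership function
    and Gaussian mutation of one pair. Returns the best chromosome and a per-generation
    history of (best, average, worst) fitness, as plotted in Figure 7."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((fitness(c, reference, normal, intrusion), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        fits = [f for f, _ in scored]
        history.append((fits[0], sum(fits) / len(fits), fits[-1]))
        parents = [c for _, c in scored[:pop_size // 2]]
        population = parents[:2]                       # elitism: the best two survive
        while len(population) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = {k: (p1[k] if rng.random() < 0.5 else p2[k]) for k in p1}
            k = rng.choice(list(child))
            a, w = child[k]
            child[k] = (max(0.0, a + rng.gauss(0, 2)), max(1.0, w + rng.gauss(0, 2)))
            population.append(child)
    return scored[0][1], history

def make_windows(rng, n, syn, fin, rst):
    """Constructed 2-second windows with counts scattered around the given means."""
    return [{"SN": max(0, round(rng.gauss(syn, 1 + syn / 5))),
             "FN": max(0, round(rng.gauss(fin, 1 + fin / 5))),
             "RN": max(0, round(rng.gauss(rst, 1 + rst / 5)))} for _ in range(n)]

def run_demo(seed=1):
    """Reference and normal test windows share one traffic profile; the intrusion
    windows are a constructed stand-in for the slides' simulated intrusions."""
    rng = random.Random(seed)
    reference = make_windows(rng, 30, syn=8, fin=8, rst=1)
    normal = make_windows(rng, 30, syn=8, fin=8, rst=1)
    intrusion = make_windows(rng, 30, syn=30, fin=3, rst=12)
    return evolve(reference, normal, intrusion)

if __name__ == "__main__":
    best, history = run_demo()
    for g, (top, avg, low) in enumerate(history):
        print(f"gen {g:2d}  best {top:+.3f}  avg {avg:+.3f}  worst {low:+.3f}")
```

Because the fitness is deterministic for a given chromosome and the two best individuals are carried over unchanged, the best fitness in the history never decreases, which is the behaviour Figure 7 shows for the most fit individual; the average and worst curves are what a practitioner would watch to judge whether the population is still exploring or has converged.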
Conclusion
The data mining techniques integrated with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by the system and to select the most effective set of features for particular types of intrusions.
Current work involves the misuse detection components, the decision module, additional machine learning components and a graphical user interface for the system. It is now planned to extend this system to operate in a high-performance cluster computing environment.
References
Ilgun, K. and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.
Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge System Laboratory, National Research Council Canada.
Kuok, C., A. Fu and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999)
Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028, Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)
Queries
Cyber Attacks - Intrusions
Introduction
Why We Need Intrusion Detection
Models Of Intrusion Detection Anomaly Detection Misuse Detection
How Genetic Algorithm is used in IDS
Conclusion
References
042023 2
System Goals and Preliminary Architecture
Contents FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection
FirewallContentsContentsContentsThe wide spread use of computer networks in todayrsquos society especially the sudden surge in importance of e-commerce to the world economy has made computer network security an international priority Since it is not technically feasible to build a system with no vulnerabilities intrusion detection has become an important area of research
Intelligent intrusion detection system (IIDS) has been developed to demonstrate the effectiveness of data mining techniques that utilize fuzzy logic and genetic algorithms
042023 3
Introduction
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 4
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Cyber attacks (intrusions) are actions that attempt to bypass security mechanisms of computer systemsThey are caused by
Attackers accessing the system from InternetInsider attackers - authorized users attempting to gain and misuse
non-authorized privileges1048714 Typical intrusion scenario
042023 5042023 5
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusionsdefined as attempts to bypass the security mechanisms of a computer or network (ldquocompromise the confidentiality integrity availability of information resourcesrdquo)
Intrusion Detection System (IDS)combination of software and hardware that attempts to perform intrusion
detectionraise the alarm when possible intrusion happens
Security mechanisms always have inevitable vulnerabilities
042023 7
Need of Intrusion Detection
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Current firewalls are not sufficient to ensure security in computer networks
ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators
1048714 Insider attacks Multiple levels of data confidentiality in commercial and
government organizations needs multi-layer protection in firewalls
042023 8042023 8
The long term goal to design and build an intelligent intrusion detection system that are
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
042023 8
System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Not easily fooled by small variations in intrusion patterns
042023 9
Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
FirewallContentsContentsContentsThe wide spread use of computer networks in todayrsquos society especially the sudden surge in importance of e-commerce to the world economy has made computer network security an international priority Since it is not technically feasible to build a system with no vulnerabilities intrusion detection has become an important area of research
Intelligent intrusion detection system (IIDS) has been developed to demonstrate the effectiveness of data mining techniques that utilize fuzzy logic and genetic algorithms
042023 3
Introduction
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 4
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Cyber attacks (intrusions) are actions that attempt to bypass security mechanisms of computer systemsThey are caused by
Attackers accessing the system from InternetInsider attackers - authorized users attempting to gain and misuse
non-authorized privileges1048714 Typical intrusion scenario
042023 5042023 5
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusionsdefined as attempts to bypass the security mechanisms of a computer or network (ldquocompromise the confidentiality integrity availability of information resourcesrdquo)
Intrusion Detection System (IDS)combination of software and hardware that attempts to perform intrusion
detectionraise the alarm when possible intrusion happens
Security mechanisms always have inevitable vulnerabilities
042023 7
Need of Intrusion Detection
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Current firewalls are not sufficient to ensure security in computer networks
ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators
1048714 Insider attacks Multiple levels of data confidentiality in commercial and
government organizations needs multi-layer protection in firewalls
042023 8042023 8
The long term goal to design and build an intelligent intrusion detection system that are
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
042023 8
System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Not easily fooled by small variations in intrusion patterns
042023 9
Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
Association Rules
Association rules were developed to find correlations in transactions using retail data.
For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A -> B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.
The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
Fuzzy Association Rules
Partitioning a quantitative attribute into crisp intervals gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.
The method of Kuok, Fu and Wong (1998) instead allows a value to contribute to the support of more than one fuzzy set.
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed.
An example of a fuzzy association rule from one set of audit data is:
SN=LOW, FN=LOW -> RN=LOW, c = 0.924, s = 0.49
where SN is the number of SYN flags, FN is the number of FIN flags and RN is the number of RST flags in a 2-second period.
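To make support and confidence concrete for both the crisp Association Rules slide and the fuzzy rules above, here is an added Python example; it is not code from the slides or from the system they describe. It reproduces the retail A -> B figures (s = 0.25, c = 0.50) on constructed transactions, then mines fuzzy rules of the SN=LOW, FN=LOW -> RN=LOW form from a handful of constructed 2-second windows, and shows the sharp boundary effect on a single count. The fuzzy-set breakpoints, the product t-norm for combining memberships, the crisp LOW threshold of 4 and every sample value are assumptions made for this example.

```python
import itertools
from typing import Dict, List, Set, Tuple

# Two-parameter fuzzy sets over a per-window count (constructed breakpoints).
def ramp_down(x, a, b): return 1.0 if x <= a else 0.0 if x >= b else (b - x) / (b - a)
def ramp_up(x, a, b):   return 0.0 if x <= a else 1.0 if x >= b else (x - a) / (b - a)
def triangle(x, c, w):  return max(0.0, 1.0 - abs(x - c) / w)

FUZZY_SETS = {
    "LOW":    lambda x: ramp_down(x, 2, 6),
    "MEDIUM": lambda x: triangle(x, 6, 4),
    "HIGH":   lambda x: ramp_up(x, 6, 10),
}
ATTRS = ("SN", "FN", "RN")   # SYN, FIN and RST counts per 2-second window

# Constructed audit records: (SN, FN, RN) per 2-second window.
WINDOWS: List[Tuple[int, int, int]] = [(1, 1, 0), (2, 3, 1), (4, 4, 4), (8, 1, 1), (12, 12, 9)]

# Constructed retail transactions reproducing the slide's A -> B example.
RETAIL = [{"soda", "chips"}, {"soda", "chips"}, {"soda"}, {"soda"},
          {"chips"}, {"bread"}, {"bread"}, {"milk"}]

def crisp_items(counts, low_max: int = 4) -> Set[str]:
    """Crisp discretisation: a count is LOW iff it is <= low_max (sharp boundary)."""
    return {f"{a}=LOW" for a, x in zip(ATTRS, counts) if x <= low_max}

def fuzzify_window(counts) -> Dict[str, float]:
    """Membership of each count in each fuzzy set; one value feeds several sets."""
    return {f"{a}={s}": f(x) for a, x in zip(ATTRS, counts) for s, f in FUZZY_SETS.items()}

def crisp_support_confidence(transactions, antecedent, consequent):
    """Classic support s = P(A and B) and confidence c = P(B | A)."""
    both = antecedent | consequent
    n_ant = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / len(transactions), (n_both / n_ant if n_ant else 0.0)

def fuzzy_support(records, items) -> float:
    """Fuzzy support: mean over records of the product of the memberships of
    the items (product t-norm, an assumption of this example)."""
    total = 0.0
    for r in records:
        m = 1.0
        for it in items:
            m *= r[it]
        total += m
    return total / len(records)

def fuzzy_support_confidence(records, antecedent, consequent):
    s_ant = fuzzy_support(records, antecedent)
    s = fuzzy_support(records, antecedent | consequent)
    return s, (s / s_ant if s_ant else 0.0)

def mine_fuzzy_rules(records, minsupport, minconfidence, max_antecedent=2):
    """Enumerate candidate rules (one fuzzy set per attribute, consequent on a
    different attribute) and keep those passing minsupport and minconfidence."""
    items = sorted(records[0])
    attr = lambda it: it.split("=")[0]
    rules = []
    for k in range(1, max_antecedent + 1):
        for ant in itertools.combinations(items, k):
            used = {attr(i) for i in ant}
            if len(used) < k:
                continue
            for con in items:
                if attr(con) in used:
                    continue
                s, c = fuzzy_support_confidence(records, set(ant), {con})
                if s >= minsupport and c >= minconfidence:
                    rules.append((ant, con, round(c, 3), round(s, 3)))
    return rules

def sharp_boundary(sn_count):
    """Crisp vs fuzzy membership of one SN count in LOW: going from 4 to 5
    drops the crisp value from 1 to 0 but only moves the fuzzy one from 0.5 to 0.25."""
    return (1.0 if sn_count <= 4 else 0.0, FUZZY_SETS["LOW"](sn_count))

if __name__ == "__main__":
    print("retail A->B (s, c):", crisp_support_confidence(RETAIL, {"soda"}, {"chips"}))
    recs = [fuzzify_window(w) for w in WINDOWS]
    for ant, con, c, s in mine_fuzzy_rules(recs, minsupport=0.1, minconfidence=0.6):
        print(", ".join(ant), "->", con, f"c = {c}, s = {s}")
```

The mined rule list uses the same minconfidence = 0.6 and minsupport = 0.1 thresholds quoted in the figure caption on the next slide; the SN=LOW, FN=LOW -> RN=LOW rule comes out with s = 0.375 and c = 0.9375 on these constructed windows.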
Fuzzy Association Rules
The figure shows results from one experiment comparing the similarities to the reference rule set of rules mined from data without intrusions and of rules mined from data with intrusions.
Figure: Comparison of similarities between the training data set and different test data sets for fuzzy association rules (minconfidence = 0.6, minsupport = 0.1). Training data set: reference (representing normal behavior). Test data sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions) and network3 (including simulated port scanning intrusions).
Frequency Episodes
This algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences; it is later extended to mine fuzzy frequency episodes.
An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, ..., ek) is a sequence of events that occurs within a time window [t, t']. The occurrence is minimal if there is no occurrence of the sequence in a subinterval of the time interval.
Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, ..., ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.
So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, ..., ek) is called frequent if frequency(P)/n >= minfrequency.
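The following Python example is added to show the minimal-occurrence counting the slide describes for crisp serial episodes; it is not code from the slides or from the system they describe. Events are (timestamp, label) pairs in a constructed sequence, and the episode is a tuple of labels e1 -> e2 -> ... -> ek. The slide does not say what n is in frequency(P)/n; this example assumes n is the number of events in S.

```python
from typing import List, Optional, Sequence, Tuple

Event = Tuple[float, str]   # (timestamp in seconds, event label)

# Constructed event sequence S, sorted by time. The labels stand for the
# attribute values an episode is built from.
SEQUENCE: List[Event] = [
    (0.0, "A"), (1.0, "B"), (2.0, "A"), (3.0, "C"), (4.0, "B"),
    (5.0, "C"), (6.0, "C"), (9.0, "A"), (15.0, "B"), (21.0, "C"),
]

def latest_start(seq: Sequence[Event], episode: Sequence[str], end: int) -> Optional[int]:
    """Index of the latest possible start of an occurrence of the serial episode
    that ends exactly at position `end`, or None if no occurrence ends there.
    Matching the episode backwards greedily from `end` gives the latest start."""
    if seq[end][1] != episode[-1]:
        return None
    pos = end
    for label in reversed(episode[:-1]):
        pos -= 1
        while pos >= 0 and seq[pos][1] != label:
            pos -= 1
        if pos < 0:
            return None
    return pos

def minimal_occurrences(seq: Sequence[Event], episode: Sequence[str]) -> List[Tuple[int, int]]:
    """All minimal occurrences as (start, end) index pairs: an occurrence is
    minimal when no proper subinterval of it also contains the episode."""
    found = []
    best_start = -1            # latest start among occurrences that end earlier
    for end in range(len(seq)):
        start = latest_start(seq, episode, end)
        if start is None:
            continue
        # An earlier-ending occurrence starting at or after `start` would be
        # nested inside [start, end], so this one is minimal only if none exists.
        if start > best_start:
            found.append((start, end))
            best_start = start
    return found

def episode_frequency(seq: Sequence[Event], episode: Sequence[str], window: float) -> int:
    """frequency(P): minimal occurrences whose duration is smaller than window."""
    return sum(1 for s, e in minimal_occurrences(seq, episode)
               if seq[e][0] - seq[s][0] < window)

def is_frequent(seq: Sequence[Event], episode: Sequence[str],
                window: float, minfrequency: float) -> bool:
    """frequency(P) / n >= minfrequency, with n taken as the number of events in S."""
    return episode_frequency(seq, episode, window) / len(seq) >= minfrequency

if __name__ == "__main__":
    P = ("A", "B", "C")
    for s, e in minimal_occurrences(SEQUENCE, P):
        print(f"minimal occurrence [{SEQUENCE[s][0]}, {SEQUENCE[e][0]}] s")
    print("frequency with window=10:", episode_frequency(SEQUENCE, P, 10.0))
    print("frequent at minfrequency=0.15:", is_frequent(SEQUENCE, P, 10.0, 0.15))
```

On the constructed sequence, A -> B -> C has three minimal occurrences; the C at 6.0 s is not counted because [2.0, 5.0] is already an occurrence inside [2.0, 6.0], and the occurrence ending at 21.0 s is minimal but spans 12 s, so it falls outside a 10-second window.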
Fuzzy Frequency Episodes
Fuzzy frequency episodes involve quantitative attributes in an event.
An example of a fuzzy frequency episode is given below:
E1: PN=LOW, E2: PN=MEDIUM -> E3: PN=MEDIUM, c = 0.854, s = 0.108, w = 10 seconds
where E1, E2 and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2-second period.
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.
This is the integration of fuzzy logic with frequency episodes.
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should result.
Genetic Algorithms
Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic it is often difficult for an expert to provide "good" definitions for the membership functions for the fuzzy variables.
The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.
Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set.
Genetic Algorithms
Figure: The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual and the average fitness of the whole population.
Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.
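The Python example below is added to show the whole tuning loop from the Genetic Algorithms slides in one place: a chromosome of (center, width) pairs, two per membership function; a fitness that rewards similarity between normal data and the reference rule set and penalises similarity between intrusion data and the reference; and a history of best, worst and average fitness per generation like the one plotted above. It is not code from the slides or from the system they describe. The data sets reference, normal1 and mscan1 are constructed, the triangular membership shape is assumed, and because the slides do not give the rule-set similarity measure, the stand-in used here is the vector of fuzzy supports of every SN=X and DP=Y itemset compared by mean absolute difference; the selection, crossover and mutation operators are ordinary textbook choices, not the ones the system used.

```python
import random
from typing import List, Tuple

Window = Tuple[float, float]     # (SN, DP) counts per 2-second window, constructed
Chromosome = List[float]         # (center, width) pairs: 3 sets x 2 attributes = 12 genes

# Constructed data sets. reference and normal1 share the same behaviour;
# mscan1 stands in for a scan trace with many distinct destination ports.
REFERENCE: List[Window] = [(1, 2), (2, 3), (3, 2), (2, 4), (1, 3), (4, 5), (2, 2), (3, 4)]
NORMAL1:   List[Window] = [(2, 2), (1, 3), (3, 3), (2, 4), (1, 2), (3, 5), (2, 3), (4, 4)]
MSCAN1:    List[Window] = [(1, 25), (2, 40), (1, 60), (3, 35), (2, 50), (1, 45), (2, 30), (1, 55)]

# A hand-picked chromosome: SN LOW/MEDIUM/HIGH then DP LOW/MEDIUM/HIGH.
HAND_CHROM: Chromosome = [2, 3, 6, 3, 10, 4, 3, 4, 8, 4, 40, 30]

def triangle(x: float, center: float, width: float) -> float:
    """Triangular membership defined by two parameters (assumed shape)."""
    return max(0.0, 1.0 - abs(x - center) / max(width, 0.5))

def memberships(chrom: Chromosome, attr_idx: int, x: float) -> List[float]:
    """Memberships of x in the three fuzzy sets of one attribute."""
    base = attr_idx * 6
    return [triangle(x, chrom[base + 2 * i], chrom[base + 2 * i + 1]) for i in range(3)]

def rule_profile(chrom: Chromosome, windows: List[Window]) -> List[float]:
    """Fuzzy support of each itemset SN=X and DP=Y (product t-norm). This vector
    is the stand-in used here for 'the set of rules mined' from the data."""
    prof = [0.0] * 9
    for sn, dp in windows:
        m_sn, m_dp = memberships(chrom, 0, sn), memberships(chrom, 1, dp)
        for i in range(3):
            for j in range(3):
                prof[3 * i + j] += m_sn[i] * m_dp[j]
    return [v / len(windows) for v in prof]

def similarity(p: List[float], q: List[float]) -> float:
    """1 minus the mean absolute difference of two profiles (assumed measure)."""
    return 1.0 - sum(abs(a - b) for a, b in zip(p, q)) / len(p)

def fitness(chrom, reference, normal, intrusion) -> float:
    """High when normal data looks like the reference and intrusion data does not."""
    ref = rule_profile(chrom, reference)
    return (similarity(rule_profile(chrom, normal), ref)
            - similarity(rule_profile(chrom, intrusion), ref))

def random_chromosome(rng: random.Random) -> Chromosome:
    return [rng.uniform(0, 30) if i % 2 == 0 else rng.uniform(1, 15) for i in range(12)]

def crossover(rng, a, b):
    cut = rng.randrange(1, len(a))            # one-point crossover
    return a[:cut] + b[cut:]

def mutate(rng, chrom, rate=0.1):
    return [g + rng.gauss(0, 1.0) if rng.random() < rate else g for g in chrom]

def run_ga(reference, normal, intrusion, generations=40, pop_size=30, seed=1):
    """Returns the best chromosome and a per-generation history of
    (best fitness, worst fitness, average fitness), as in the evolution plot."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    history = []
    for gen in range(generations + 1):
        scored = sorted(((fitness(c, reference, normal, intrusion), c) for c in pop),
                        key=lambda t: t[0], reverse=True)
        fits = [f for f, _ in scored]
        history.append((fits[0], fits[-1], sum(fits) / len(fits)))
        if gen == generations:
            break
        def tournament():
            a, b = rng.sample(scored, 2)      # binary tournament selection
            return a[1] if a[0] >= b[0] else b[1]
        children = [scored[0][1]]             # elitism keeps the best solution
        while len(children) < pop_size:
            children.append(mutate(rng, crossover(rng, tournament(), tournament())))
        pop = children
    return scored[0][1], history

if __name__ == "__main__":
    best, history = run_ga(REFERENCE, NORMAL1, MSCAN1)
    for gen, (b, w, avg) in enumerate(history):
        print(f"gen {gen:2d}  best={b:.3f}  worst={w:.3f}  avg={avg:.3f}")
    print("tuned (center, width) pairs:", [round(g, 2) for g in best])
```

Because the elite is copied unchanged into every new generation, the best fitness never decreases, which is the monotone upper curve of the evolution plot; the worst and average curves move with selection pressure and mutation.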
Conclusion
The data mining techniques integrated with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by the system and to select the most effective set of features for particular types of intrusions.
Currently it is used for the misuse detection components, the decision module, additional machine learning components and a graphical user interface for the system. There are now plans to extend this system to operate in a high performance cluster computing environment.
References
Ilgun, K. and A. Kemmerer. 1995. State transition analysis: a rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.
Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge System Laboratory, National Research Council Canada.
Kuok, C., A. Fu and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)
Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)
Queries
042023 4
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Cyber attacks (intrusions) are actions that attempt to bypass security mechanisms of computer systemsThey are caused by
Attackers accessing the system from InternetInsider attackers - authorized users attempting to gain and misuse
non-authorized privileges1048714 Typical intrusion scenario
042023 5042023 5
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusionsdefined as attempts to bypass the security mechanisms of a computer or network (ldquocompromise the confidentiality integrity availability of information resourcesrdquo)
Intrusion Detection System (IDS)combination of software and hardware that attempts to perform intrusion
detectionraise the alarm when possible intrusion happens
Security mechanisms always have inevitable vulnerabilities
042023 7
Need of Intrusion Detection
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Current firewalls are not sufficient to ensure security in computer networks
ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators
1048714 Insider attacks Multiple levels of data confidentiality in commercial and
government organizations needs multi-layer protection in firewalls
042023 8042023 8
The long term goal to design and build an intelligent intrusion detection system that are
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
042023 8
System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Not easily fooled by small variations in intrusion patterns
042023 9
Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 5042023 5
Cyber Attack-Intrusion
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusionsdefined as attempts to bypass the security mechanisms of a computer or network (ldquocompromise the confidentiality integrity availability of information resourcesrdquo)
Intrusion Detection System (IDS)combination of software and hardware that attempts to perform intrusion
detectionraise the alarm when possible intrusion happens
Security mechanisms always have inevitable vulnerabilities
042023 7
Need of Intrusion Detection
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Current firewalls are not sufficient to ensure security in computer networks
ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators
1048714 Insider attacks Multiple levels of data confidentiality in commercial and
government organizations needs multi-layer protection in firewalls
042023 8042023 8
The long term goal to design and build an intelligent intrusion detection system that are
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
042023 8
System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Not easily fooled by small variations in intrusion patterns
042023 9
Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Intrusion Detection
Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to bypass the security mechanisms of a computer or network ("compromise the confidentiality, integrity or availability of information resources").
Intrusion Detection System (IDS): a combination of software and hardware that attempts to perform intrusion detection and raise the alarm when a possible intrusion happens.
Security mechanisms always have inevitable vulnerabilities.
Need of Intrusion Detection
Current firewalls are not sufficient to ensure security in computer networks.
"Security holes" are caused by allowances made to users, programmers and administrators.
Insider attacks.
Multiple levels of data confidentiality in commercial and government organizations need multi-layer protection in firewalls.
System Goals
The long-term goal is to design and build an intelligent intrusion detection system that is:
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
Not easily fooled by small variations in intrusion patterns
Architecture
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive").
These models can be more sophisticated and precise than manually created signatures.
Unable to detect attacks whose instances have not yet been observed.
Anomaly detection
Build models of "normal" behavior and detect anomalies as deviations from it.
Possible high false alarm rate: previously unseen (yet legitimate) system behaviors may be recognized as anomalies.
Anomaly Detection via Fuzzy Data Mining
Fuzzy logic
Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, it allows representation of overlapping categories).
Partial membership in sets or categories.
Data mining
Automatically learn patterns from large quantities of data.
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection.
Fuzzy Logic Method
ID using Fuzzy Logic
Suppose one wants to write a rule such as:
If the number of different destination addresses during the last 2 seconds was high, then an unusual situation exists.
Using fuzzy logic, a rule like the one shown above could be written as:
If DP = high, then an unusual situation exists.
DP is a fuzzy variable and high is a fuzzy set.
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
ID using Data Mining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection:
Association rules → Fuzzy association rules
Frequency episodes → Fuzzy frequency episodes
Association Rules
Association rules were developed to find correlations in transactions using retail data.
For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.
The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
Fuzzy Association Rules
Partitioning a quantitative value into crisp categories gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.
The method of Kuok, Fu and Wong (1998) allows a value to contribute to the support of more than one fuzzy set.
For anomaly detection, we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed.
An example of a fuzzy association rule from one set of audit data is:
SN=LOW, FN=LOW → RN=LOW, c = 0.924, s = 0.49
where SN is the number of SYN flags, FN is the number of FIN flags and RN is the number of RST flags in a 2-second period.
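The fuzzy rule "If DP = high", the sharp boundary problem and the SN/FN/RN fuzzy association rule, worked through in Python as an added example; this is not code from the slides or from the IIDS system they describe. The trapezoidal LOW/MEDIUM/HIGH sets and their breakpoints, the alpha cutoff at which a rule counts as activated, the eight constructed 2-second audit records and the use of the minimum as the t-norm are all assumptions, and the rule mining is a brute-force enumeration rather than the Apriori candidate generation the slides mention.

```python
"""Fuzzy sets, fuzzy rule activation and fuzzy association rules over audit records.

Added example, not code from the slides or from the IIDS project. The fuzzy-set
parameters, the alpha cutoff and the audit records are constructed assumptions.
"""
from itertools import combinations, product

INF = float("inf")


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], is 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Assumed fuzzy sets for a count observed in a 2-second window (the slides give no
# parameters). Neighbouring sets overlap, so a value can belong partly to two of them.
SETS = {"LOW": (0, 0, 5, 15), "MEDIUM": (5, 15, 25, 35), "HIGH": (25, 35, INF, INF)}


def fuzzify(value, sets=SETS):
    """Degree of membership of one value in every fuzzy set."""
    return {name: trapezoid(value, *params) for name, params in sets.items()}


def crisp_rule(dp, cutoff=30):
    """Non-fuzzy version 'If DP > 30 then unusual': flips from False to True
    between 30 and 31, which is the sharp boundary problem."""
    return dp > cutoff


def rule_dp_high(dp, alpha=0.5):
    """'If DP = high then an unusual situation exists'. The membership of DP in HIGH
    is the activation degree; the rule fires when it reaches alpha (assumed cutoff)."""
    degree = fuzzify(dp)["HIGH"]
    return degree, degree >= alpha


# Constructed 2-second windows: counts of SYN (SN), FIN (FN) and RST (RN) flags.
RECORDS = [
    {"SN": 3, "FN": 2, "RN": 1}, {"SN": 4, "FN": 4, "RN": 0},
    {"SN": 2, "FN": 3, "RN": 2}, {"SN": 5, "FN": 5, "RN": 1},
    {"SN": 3, "FN": 2, "RN": 0}, {"SN": 20, "FN": 18, "RN": 3},
    {"SN": 4, "FN": 3, "RN": 1}, {"SN": 12, "FN": 6, "RN": 2},
]


def fuzzy_items(record, sets=SETS):
    """Membership of one record in every (attribute, fuzzy set) item, e.g. ('SN', 'LOW')."""
    return {(attr, name): trapezoid(value, *params)
            for attr, value in record.items() for name, params in sets.items()}


def fuzzy_support(itemset, table):
    """Fuzzy support: mean over records of the minimum membership across the items
    (min as t-norm is an assumption; a value partly in LOW and MEDIUM supports both)."""
    return sum(min(row[item] for item in itemset) for row in table) / len(table)


def mine_fuzzy_rules(records, minsupport, minconfidence, sets=SETS):
    """Brute-force fuzzy association rules X -> y with one item per attribute and a
    single-item consequent. Returns {(antecedent, consequent): (confidence, support)}."""
    table = [fuzzy_items(r, sets) for r in records]
    attrs = sorted(records[0])
    rules = {}
    for k in (2, 3):
        for attr_combo in combinations(attrs, k):
            for set_combo in product(sets, repeat=k):
                itemset = tuple(zip(attr_combo, set_combo))
                s = fuzzy_support(itemset, table)
                if s < minsupport:
                    continue
                for consequent in itemset:
                    antecedent = tuple(i for i in itemset if i != consequent)
                    c = s / fuzzy_support(antecedent, table)
                    if c >= minconfidence:
                        rules[(antecedent, consequent)] = (c, s)
    return rules


if __name__ == "__main__":
    for dp in (30, 31):
        print(dp, crisp_rule(dp), rule_dp_high(dp))
    for (ante, cons), (c, s) in sorted(mine_fuzzy_rules(RECORDS, 0.1, 0.6).items()):
        lhs = ", ".join(f"{a}={n}" for a, n in ante)
        print(f"{lhs} -> {cons[0]}={cons[1]}  c = {c:.3f}, s = {s:.3f}")
```

With these parameters DP = 30 sits half in MEDIUM and half in HIGH, so the fuzzy rule reports a degree of 0.5 where the crisp rule simply says "no"; one port more moves the degree to 0.6 while the crisp answer jumps to "yes". The mined rule SN=LOW, FN=LOW → RN=LOW comes out with c = 1.0 and s ≈ 0.79 on the constructed records, in the same form as the slide's example.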
Fuzzy Association Rules
The figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions.
Figure: Comparison of similarities between the training data set and different test data sets for fuzzy association rules (minconfidence = 0.6, minsupport = 0.1). Training data set: reference (representing normal behavior). Test data sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions) and network3 (including simulated port scanning intrusions).
Frequency Episodes
This algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences. Later it is used to mine fuzzy frequency episodes.
An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.
Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.
So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent if frequency(P)/n ≥ minfrequency.
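The minimal-occurrence definition and the frequency(P)/n ≥ minfrequency criterion, as an added Python example that is not code from the slides. The event sequence and the window and minfrequency values are constructed, and n is taken as the number of events in the sequence, which the slides leave undefined.

```python
"""Minimal occurrences and frequency of a serial episode in an event sequence.

Added example, not code from the slides. The event sequence and thresholds are
constructed; taking n as the number of events in the sequence is an assumption.
"""

# Constructed event sequence: (timestamp, event type). The two A's at t=1 and t=2
# show why minimality matters: only the occurrence starting at t=2 is minimal.
EVENTS = [
    (1, "A"), (2, "A"), (3, "B"), (4, "C"), (6, "B"), (7, "C"),
    (9, "A"), (10, "B"), (11, "C"), (15, "A"), (16, "B"), (25, "C"),
]


def _earliest_completion(seq, start, episode):
    """Index of the earliest event completing the serial episode e1, e2, ..., ek when
    e1 is matched at index `start`; None if the episode cannot be completed."""
    pos = start
    for ev in episode[1:]:
        pos += 1
        while pos < len(seq) and seq[pos][1] != ev:
            pos += 1
        if pos >= len(seq):
            return None
    return pos


def minimal_occurrences(sequence, episode, window):
    """Intervals [t, t'] of minimal occurrences of `episode` whose width is smaller
    than `window`. An occurrence is minimal when no proper subinterval of [t, t']
    also contains the episode."""
    seq = sorted(sequence)
    candidates = []  # (start index, end index) of the earliest-ending occurrence per start
    for i, (_, ev) in enumerate(seq):
        if ev != episode[0]:
            continue
        end = _earliest_completion(seq, i, episode)
        if end is None:
            break  # no later start can complete the episode either
        candidates.append((i, end))
    minimal = []
    for idx, (i, end) in enumerate(candidates):
        # A later start that finishes no later lies inside [t_i, t_end]: not minimal.
        if any(later_end <= end for _, later_end in candidates[idx + 1:]):
            continue
        t, t_end = seq[i][0], seq[end][0]
        if t_end - t < window:
            minimal.append((t, t_end))
    return minimal


def episode_frequency(sequence, episode, window):
    """frequency(P): number of minimal occurrences inside intervals smaller than window."""
    return len(minimal_occurrences(sequence, episode, window))


def is_frequent(sequence, episode, window, minfrequency):
    """P is frequent if frequency(P)/n >= minfrequency, with n = number of events (assumed)."""
    return episode_frequency(sequence, episode, window) / len(sequence) >= minfrequency


if __name__ == "__main__":
    P = ("A", "B", "C")
    print(minimal_occurrences(EVENTS, P, window=5))
    print(is_frequent(EVENTS, P, window=5, minfrequency=0.15))
```

For the episode A, B, C there are three minimal occurrences, [2, 4], [9, 11] and [15, 25]; the occurrence starting at t = 1 is discarded because [2, 4] lies inside it, and with window = 5 the last one is too wide to count, so frequency(P)/n = 2/12.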
Fuzzy Frequency Episodes
Fuzzy frequency episodes involve quantitative attributes in an event.
An example of a fuzzy frequency episode is given below:
E1: PN=LOW, E2: PN=MEDIUM → E3: PN=MEDIUM, c = 0.854, s = 0.108, w = 10 seconds
where E1, E2 and E3 are events that occur in that order and PN is the number of distinct destination ports within a 2-second period.
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.
This is the integration of fuzzy logic with frequency episodes.
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should result.
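The consecutive-logins rule in both its crisp and a fuzzy form, followed by a small decision step, as an added Python example; these are not the FuzzyCLIPS rules of the slides or of the project. The audit records, the fuzzy set "many" and the alarm policy are assumptions.

```python
"""Misuse detection rules (crisp and fuzzy) over audit records, with a decision step.

Added example, not code from the slides or from FuzzyCLIPS. The audit records, the
fuzzy set 'many' and the decision policy are constructed assumptions.
"""

# Constructed audit records: (timestamp, user, event).
AUDIT_LOG = [
    (1, "alice", "login"), (2, "alice", "logout"),
    (3, "bob", "login"), (4, "bob", "login"), (5, "bob", "login"), (6, "bob", "login"),
    (7, "carol", "login"), (8, "carol", "login"), (9, "carol", "logout"), (10, "carol", "login"),
    (11, "alice", "login"),
]


def consecutive_logins(log):
    """Longest run of back-to-back 'login' events per user; any other event by the
    same user ends the run."""
    longest, current = {}, {}
    for _, user, event in sorted(log):
        run = current.get(user, 0) + 1 if event == "login" else 0
        current[user] = run
        longest[user] = max(longest.get(user, 0), run)
    return longest


def crisp_rule(runs, limit=3):
    """The slide's rule: IF consecutive logins by a user > 3 THEN suspicious."""
    return {user for user, n in runs.items() if n > limit}


def fuzzy_rule(runs, lo=2, hi=6):
    """Fuzzy variant: degree to which the run length is 'many' (assumed ramp from lo to hi)."""
    return {user: min(1.0, max(0.0, (n - lo) / (hi - lo))) for user, n in runs.items()}


def decision(log, alpha=0.5):
    """Decision component (simplified, assumed policy): alarm for a user when the
    crisp rule fires or the fuzzy degree of suspicion reaches alpha."""
    runs = consecutive_logins(log)
    flagged = crisp_rule(runs)
    degrees = fuzzy_rule(runs)
    return sorted(user for user in runs if user in flagged or degrees[user] >= alpha)


if __name__ == "__main__":
    print(consecutive_logins(AUDIT_LOG))
    print(decision(AUDIT_LOG))
```

On the constructed log only bob reaches four consecutive logins; carol's run is broken by a logout and restarts at one, so neither the crisp rule nor the fuzzy degree (0 for her longest run of two) flags her.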
Genetic Algorithms
Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide "good" definitions for the membership functions for the fuzzy variables.
The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.
Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set.
Figure: the evolution of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual and the average fitness of the whole population.
Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.
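The GA described above, tuning two-parameter membership functions with a two-part fitness and tracking best, worst and average fitness per generation, as an added Python example that is not code from the slides or the IIDS project. The triangular two-parameter shape, the sample DP values, the GA operators and above all the fitness function are assumptions: the slides score a chromosome by the similarity of mined rule sets, whereas this stand-in measures directly how well the normal and intrusion samples fit a fixed reference profile, keeping the same structure (reward similarity of normal data, penalise similarity of intrusion data).

```python
"""Genetic algorithm tuning two-parameter fuzzy membership functions.

Added example, not code from the slides or the IIDS project. The triangular
two-parameter shape, the sample DP values, the GA operators and the simplified
fitness (a stand-in for the slides' rule-set similarity) are all assumptions.
"""
import random

# Constructed DP values (distinct destination ports per 2-second window):
# a reference/normal data set and a data set containing simulated port scans.
NORMAL_DP = [2, 3, 4, 3, 5, 6, 4, 2, 7, 5]
INTRUSION_DP = [30, 45, 60, 38, 52, 70, 41, 55, 48, 65]


def triangle(x, center, width):
    """Membership function defined by two parameters: 1 at center, 0 at distance width."""
    return max(0.0, 1.0 - abs(x - center) / width)


def similarity(chromosome, data):
    """Stand-in similarity: mean degree to which the samples fit the reference profile
    'normal windows have DP LOW or MEDIUM'. Chromosome = (LOW center, LOW width,
    MEDIUM center, MEDIUM width)."""
    lc, lw, mc, mw = chromosome
    return sum(max(triangle(x, lc, lw), triangle(x, mc, mw)) for x in data) / len(data)


def fitness(chromosome):
    """Reward similarity of normal data to the reference, penalise that of intrusion data."""
    return similarity(chromosome, NORMAL_DP) - similarity(chromosome, INTRUSION_DP)


def random_chromosome(rng):
    return [rng.uniform(0, 80), rng.uniform(1, 40), rng.uniform(0, 80), rng.uniform(1, 40)]


def tournament(pop, scores, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    return pop[max(rng.sample(range(len(pop)), k), key=lambda i: scores[i])]


def crossover(a, b, rng):
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chrom, rng, rate=0.2, sigma=4.0):
    child = list(chrom)
    for i in range(len(child)):
        if rng.random() < rate:
            child[i] += rng.gauss(0.0, sigma)
    # Counts are non-negative and a membership function needs a positive width.
    child[0], child[2] = max(0.0, child[0]), max(0.0, child[2])
    child[1], child[3] = max(1.0, child[1]), max(1.0, child[3])
    return child


def evolve(generations=40, pop_size=30, seed=1):
    """Run the GA; returns the best chromosome and, per generation, the best, worst
    and average fitness (the three curves the slides plot)."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        best_i = max(range(pop_size), key=lambda i: scores[i])
        history.append({"best": scores[best_i], "worst": min(scores),
                        "average": sum(scores) / pop_size})
        children = [pop[best_i]]  # elitism: the best solution survives unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, scores, rng), tournament(pop, scores, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = max(pop, key=fitness)
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    for g in (0, len(history) - 1):
        h = history[g]
        print(f"gen {g:2d}: best={h['best']:.3f} avg={h['average']:.3f} worst={h['worst']:.3f}")
    print("tuned parameters:", [round(p, 2) for p in best])
```

Because the elite is carried over unchanged, the best-fitness curve never decreases, which is the monotone climb the slides show for the most fit individual; the worst and average curves are free to fluctuate as crossover and mutation explore. A hand-built chromosome with LOW centred at 4 and width 10 scores 0.87 on this data, a useful sanity check before trusting the evolved parameters.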
Conclusion
The integration of data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by our system and to select the most effective set of features for particular types of intrusions.
Current work includes the misuse detection components, the decision module, additional machine learning components and a graphical user interface for the system. It is planned to extend this system to operate in a high performance cluster computing environment.
References
Ilgun, K., and A. Kemmerer. 1995. State transition analysis: a rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.
Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge Systems Laboratory, National Research Council Canada.
Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)
Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028, Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)
Queries
Security mechanisms always have inevitable vulnerabilities
042023 7
Need of Intrusion Detection
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Current firewalls are not sufficient to ensure security in computer networks
ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators
1048714 Insider attacks Multiple levels of data confidentiality in commercial and
government organizations needs multi-layer protection in firewalls
042023 8042023 8
The long term goal to design and build an intelligent intrusion detection system that are
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
042023 8
System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Not easily fooled by small variations in intrusion patterns
042023 9
Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 8042023 8
The long term goal to design and build an intelligent intrusion detection system that are
Distributed
Real-time
Accurate (low false negative and false positive rates)
Flexible
Adaptive in new environments
Modular with both misuse and anomaly detection components
042023 8
System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Not easily fooled by small variations in intrusion patterns
042023 9
Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The data mining techniques integrated with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions of the fuzzy variables used by the system and to select the most effective set of features for particular types of intrusions.
Current work covers the misuse detection components, the decision module, additional machine learning components and a graphical user interface for the system. It is planned to extend the system to operate in a high performance cluster computing environment.
References
Ilgun, K. and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.
Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge Systems Laboratory, National Research Council Canada.
Kuok, C., A. Fu and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)
Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028, Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99reports/99tr028abstract.html)
Queries
Architecture
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive").
These models can be more sophisticated and precise than manually created signatures.
They are unable to detect attacks whose instances have not yet been observed.
Anomaly detection
Build models of "normal" behavior and detect anomalies as deviations from it.
Possible high false alarm rate: previously unseen (yet legitimate) system behaviors may be recognized as anomalies.
Anomaly Detection via Fuzzy Data Mining
Fuzzy logic
Partial membership in sets or categories.
Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, it allows the representation of overlapping categories).
Data mining
Automatically learn patterns from large quantities of data.
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection.
Fuzzy Logic Method
ID using Fuzzy Logic
Suppose one wants to write a rule such as:
If the number of different destination addresses during the last 2 seconds was high, then an unusual situation exists.
Using fuzzy logic, a rule like the one shown above could be written as:
If DP = high, then an unusual situation exists.
Here DP is a fuzzy variable and high is a fuzzy set.
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
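The rule above, worked through on constructed data as an added example (this is not code from the slides or from the IIDS project). It counts distinct destination ports per 2-second window, fuzzifies the count into LOW, MEDIUM and HIGH, and compares the activation of a sharp-threshold rule with the fuzzy rule "IF DP = HIGH THEN unusual". The connection records, the window width and the membership breakpoints are all assumptions chosen for the illustration.

```python
from collections import defaultdict

# Constructed connection records (timestamp in seconds, destination port).
# These are made-up samples for this example, not data from the slides.
CONNECTIONS = [
    (0.2, 80), (0.9, 443), (1.4, 80),                           # window 0: 2 ports
    (2.1, 22), (2.3, 23), (2.5, 25), (2.6, 80), (2.8, 110),
    (3.0, 135), (3.1, 139), (3.3, 443), (3.5, 445),             # window 1: 9 ports
    (4.2, 80), (4.8, 443), (5.5, 22),                           # window 2: 3 ports
]

INF = float("inf")

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 up to a, rising to 1 on [b, c], falling to 0 at d.
    Passing -inf/inf for the outer points gives the left/right shoulder shapes."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Fuzzy sets for the fuzzy variable DP (distinct destination ports in a
# 2-second window). The breakpoints are assumptions chosen for this example.
DP_SETS = {
    "LOW":    lambda x: trapezoid(x, -INF, -INF, 2, 5),
    "MEDIUM": lambda x: trapezoid(x, 2, 5, 7, 10),
    "HIGH":   lambda x: trapezoid(x, 7, 10, INF, INF),
}

def windows(records, width=2.0):
    """Number of distinct destination ports per fixed window of `width` seconds."""
    ports = defaultdict(set)
    for ts, port in records:
        ports[int(ts // width)].add(port)
    return {w: len(p) for w, p in sorted(ports.items())}

def fuzzify(dp):
    """Degree of membership of a DP value in each fuzzy set; a value near a
    boundary belongs partly to two sets instead of jumping between categories."""
    return {name: mf(dp) for name, mf in DP_SETS.items()}

def crisp_rule(dp, threshold=10):
    """Sharp-boundary rule: 'DP >= threshold' fires fully or not at all."""
    return 1.0 if dp >= threshold else 0.0

def fuzzy_rule(dp):
    """IF DP = HIGH THEN an unusual situation exists: the conclusion holds to
    the degree that DP belongs to HIGH, so the rule activates gradually."""
    return DP_SETS["HIGH"](dp)

def evaluate(records):
    """Per window: (DP, crisp activation, fuzzy activation)."""
    return {w: (dp, crisp_rule(dp), fuzzy_rule(dp)) for w, dp in windows(records).items()}

if __name__ == "__main__":
    for w, (dp, crisp, fuzzy) in evaluate(CONNECTIONS).items():
        print(f"window {w}: DP={dp:2d}  crisp={crisp:.0f}  fuzzy={fuzzy:.2f}  {fuzzify(dp)}")
```

Window 1 (9 ports) is the interesting case: the crisp rule with threshold 10 stays silent, while the fuzzy rule reports an activation of two thirds, so the situation is surfaced with a graded strength rather than missed entirely.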
ID using Data Mining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection, each in a crisp and a fuzzy form:
Association rules
Frequency episodes
Fuzzy association rules
Fuzzy frequency episodes
Association Rules
Association rules were developed to find correlations in transactions using retail data.
For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks by the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips, and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.
The Apriori algorithm requires two thresholds: minconfidence (the minimum confidence) and minsupport (the minimum support).
Fuzzy Association Rules
Partitioning a quantitative attribute into crisp intervals gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.
The method of Kuok, Fu and Wong allows a value to contribute to the support of more than one fuzzy set.
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed.
An example of a fuzzy association rule from one set of audit data is:
SN=LOW, FN=LOW → RN=LOW, c = 0.924, s = 0.49
where SN is the number of SYN flags, FN is the number of FIN flags and RN is the number of RST flags in a 2-second period.
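Support and confidence for both the crisp retail rule A → B and the fuzzy rule SN=LOW, FN=LOW → RN=LOW, as an added example that is not code from the slides or the project. One function computes both: each row maps an item (or "VARIABLE=SET") to a membership degree, conjunction uses min, and with 0/1 degrees the fuzzy formulas reduce to the ordinary Apriori ones. The retail transactions, the per-window flag counts, the LOW membership breakpoints and the crisp cutoff are constructed assumptions; the retail rows are chosen to reproduce the 25% / 50% figures quoted above.

```python
# Constructed retail transactions with crisp (0/1) membership. They reproduce
# the slide's numbers: 25% of customers buy both A and B and 50% of the
# customers who buy A also buy B.
RETAIL = [
    {"A": 1.0, "B": 1.0},
    {"A": 1.0, "B": 0.0},
    {"A": 0.0, "B": 1.0},
    {"A": 0.0, "B": 0.0},
]

# Constructed per-2-second-window flag counts (SN, FN, RN); not real audit data.
WINDOWS = [(4, 5, 3), (12, 8, 6), (16, 14, 22), (25, 3, 2), (9, 18, 12)]

def low(count):
    """Membership of a flag count in the fuzzy set LOW: 1 up to 10, falling
    linearly to 0 at 20 (the breakpoints are assumptions for this example)."""
    if count <= 10:
        return 1.0
    if count >= 20:
        return 0.0
    return (20 - count) / 10

def fuzzy_window(sn, fn, rn):
    """Fuzzify one window: each item is 'VARIABLE=SET' with a degree in [0, 1]."""
    return {"SN=LOW": low(sn), "FN=LOW": low(fn), "RN=LOW": low(rn)}

def crisp_window(sn, fn, rn, cutoff=15):
    """Crisp binning of the same window: a count is LOW iff it is below cutoff,
    so 14 and 16 land in different categories (the sharp boundary problem)."""
    return {"SN=LOW": float(sn < cutoff), "FN=LOW": float(fn < cutoff), "RN=LOW": float(rn < cutoff)}

def support_confidence(rows, antecedent, consequent):
    """Support and confidence of antecedent -> consequent over rows.

    Conjunction of items uses min, a common choice in fuzzy association rule
    mining; with 0/1 degrees it reduces to the usual Apriori definitions.
    """
    n = len(rows)
    both = antecedent + consequent
    sup_both = sum(min(r.get(i, 0.0) for i in both) for r in rows) / n
    sup_ant = sum(min(r.get(i, 0.0) for i in antecedent) for r in rows) / n
    conf = sup_both / sup_ant if sup_ant else 0.0
    return sup_both, conf

def passes(support, confidence, minsupport, minconfidence):
    """Apriori keeps a rule only if both thresholds are met."""
    return support >= minsupport and confidence >= minconfidence

def mine_example(fuzzifier=fuzzy_window):
    """Support/confidence of SN=LOW, FN=LOW -> RN=LOW on the constructed windows."""
    rows = [fuzzifier(*w) for w in WINDOWS]
    return support_confidence(rows, ["SN=LOW", "FN=LOW"], ["RN=LOW"])

if __name__ == "__main__":
    print("retail A -> B:", support_confidence(RETAIL, ["A"], ["B"]))
    print("fuzzy  SN=LOW,FN=LOW -> RN=LOW:", mine_example())
    print("crisp  SN=LOW,FN=LOW -> RN=LOW:", mine_example(crisp_window))
```

Comparing the two fuzzifiers on the same five windows shows the point of the fuzzy version: under crisp binning the third window (SN = 16, FN = 14) is simply not LOW and drops out, whereas the fuzzy rule lets it contribute partially to the antecedent support, which is why the fuzzy confidence is below 1 while the crisp one is exactly 1.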
Fuzzy Association Rules
The figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and from data with intrusions.
Comparison of similarities between the training data set and different test data sets for fuzzy association rules (minconfidence = 0.6, minsupport = 0.1). Training data set: reference (representing normal behavior). Test data sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions) and network3 (including simulated port scanning intrusions).
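The comparison of a mined rule set against the reference set, as an added example that is not the project's code. The slides do not give the similarity formula, so the measure here is an assumption: two rule sets share a rule when antecedent and consequent match, a shared rule scores by how close its confidence and support are, and the sum is normalized by the size of the reference set. The three rule sets are constructed by hand to mimic a reference, a second normal data set and a port-scan data set.

```python
# Constructed rule sets: key = (antecedent items, consequent item),
# value = (confidence, support). Not mined from real data.
REFERENCE = {
    (("SN=LOW", "FN=LOW"), "RN=LOW"): (0.92, 0.49),
    (("SN=LOW",), "FN=LOW"):          (0.80, 0.30),
    (("PN=LOW",), "SN=LOW"):          (0.70, 0.20),
}
BASELINE = {   # another intrusion-free data set: nearly the same rules
    (("SN=LOW", "FN=LOW"), "RN=LOW"): (0.90, 0.50),
    (("SN=LOW",), "FN=LOW"):          (0.80, 0.30),
    (("FN=LOW",), "RN=LOW"):          (0.60, 0.15),
}
SCAN = {       # data set with a simulated port scan: rules shift or vanish
    (("SN=LOW", "FN=LOW"), "RN=LOW"): (0.50, 0.20),
    (("PN=HIGH",), "SN=HIGH"):        (0.85, 0.40),
    (("SN=HIGH",), "RN=HIGH"):        (0.75, 0.35),
}

def rule_similarity(cs1, cs2):
    """Similarity of one rule present in two sets, from the distance of its
    confidence and support values (an assumed measure for this example)."""
    (c1, s1), (c2, s2) = cs1, cs2
    return max(0.0, 1 - abs(c1 - c2)) * max(0.0, 1 - abs(s1 - s2))

def set_similarity(reference, mined):
    """Similarity of a mined rule set to the reference set: rules missing from
    either set contribute 0, shared rules contribute their rule similarity,
    and the sum is divided by the reference size (assumed normalization)."""
    shared = reference.keys() & mined.keys()
    total = sum(rule_similarity(reference[k], mined[k]) for k in shared)
    return total / len(reference) if reference else 0.0

def classify(reference, mined, threshold):
    """Anomaly verdict: flag the data set when its similarity to the reference
    falls below threshold (chosen in practice from baseline comparisons)."""
    return set_similarity(reference, mined) < threshold

if __name__ == "__main__":
    print("baseline:", round(set_similarity(REFERENCE, BASELINE), 3))
    print("scan:    ", round(set_similarity(REFERENCE, SCAN), 3))
```

The baseline set keeps two of the three reference rules with nearly the same statistics and scores high; the scan set keeps one rule with badly shifted confidence and support and scores low, which is the separation the experiment in the figure relies on.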
Frequency Episodes
The algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences; it was later extended to mine fuzzy frequency episodes.
An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t′]. The occurrence of the episode is minimal if there is no occurrence of the sequence in a subinterval of that time interval.
Given a threshold window (bounding the timestamps), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.
Given another threshold minfrequency (the minimum frequency), an episode P(e1, e2, …, ek) is called frequent if
frequency(P) / n ≥ minfrequency
where n is the number of events in S.
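The minimal-occurrence counting just defined, implemented for crisp serial episodes as an added example (not code from the slides or the project). For each event that matches the last element of the episode, the code walks backwards greedily to find the occurrence ending there with the latest possible start; because those starts never decrease as the end point advances, an occurrence is minimal exactly when its start is strictly later than the start found for the previous end point. The event sequence is constructed; the window comparison uses "smaller than window" as the text states, and n is taken as the number of events in the sequence.

```python
# Constructed event sequence of (timestamp, event type) pairs; the values are
# made up for this example and are not audit data from the slides.
EVENTS = [
    (1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "B"), (6, "C"),
    (8, "A"), (9, "B"), (12, "C"), (13, "A"), (14, "B"), (15, "C"),
]

def minimal_occurrences(sequence, episode, window):
    """Minimal occurrences [(t_start, t_end)] of the serial episode
    episode = (e1, ..., ek) whose span t_end - t_start is smaller than window.

    For every event matching ek at time t_end we walk backwards greedily:
    the latest earlier e(k-1), then the latest e(k-2) before that, and so on.
    That gives the occurrence ending at t_end with the latest possible start.
    These starts never decrease as t_end increases, so the interval is a
    minimal occurrence exactly when its start is strictly later than the start
    found for the previous end point (otherwise the earlier, shorter
    occurrence lies inside it).
    """
    seq = sorted(sequence)
    found = []
    last_start = None
    for end_idx, (t_end, event) in enumerate(seq):
        if event != episode[-1]:
            continue
        idx = end_idx
        complete = True
        for needed in reversed(episode[:-1]):
            idx -= 1
            while idx >= 0 and seq[idx][1] != needed:
                idx -= 1
            if idx < 0:
                complete = False
                break
        if not complete:
            continue
        t_start = seq[idx][0]
        is_minimal = last_start is None or t_start > last_start
        last_start = t_start
        if is_minimal and t_end - t_start < window:
            found.append((t_start, t_end))
    return found

def frequency(sequence, episode, window):
    """frequency(P) / n, with n the number of events in the sequence."""
    return len(minimal_occurrences(sequence, episode, window)) / len(sequence)

def is_frequent(sequence, episode, window, minfrequency):
    """The frequent-episode test: frequency(P) / n >= minfrequency."""
    return frequency(sequence, episode, window) >= minfrequency

if __name__ == "__main__":
    for win in (4, 5):
        print(f"P(A,B,C), window={win}:", minimal_occurrences(EVENTS, ("A", "B", "C"), win))
    print("P(A,C), window=10:", minimal_occurrences(EVENTS, ("A", "C"), 10))
```

The episode P(A, C) shows the minimality rule at work: the interval [1, 4] does contain A followed by C, but it is not counted because the shorter occurrence [3, 4] lies inside it, and [3, 6] is rejected for the same reason.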
Fuzzy Frequency Episodes
Fuzzy frequency episodes integrate fuzzy logic with frequency episodes and involve quantitative attributes in an event.
An example of a fuzzy frequency episode is given below:
E1: PN=LOW, E2: PN=MEDIUM → E3: PN=MEDIUM, c = 0.854, s = 0.108, w = 10 seconds
where E1, E2 and E3 are events that occur in that order and PN is the number of distinct destination ports within a 2-second period.
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should result.
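The consecutive-logins rule run over constructed audit lines, as an added example; this is plain Python rather than the FuzzyCLIPS rules the slides mention, and it is not the project's code. It shows the same rule in its non-fuzzy form and in a fuzzy form whose suspicion ramps up with the login count, plus a small stand-in for the decision component that combines component outputs. The log format, the ramp endpoints (3 to 6 logins) and the max-combination in the decision step are assumptions of the example.

```python
import re

# Constructed login audit lines: "<timestamp> <user> <event>". Not real logs.
LOG = """\
2023-04-20T09:00:01 alice login
2023-04-20T09:00:05 bob login
2023-04-20T09:00:09 alice login
2023-04-20T09:00:12 bob logout
2023-04-20T09:00:15 alice login
2023-04-20T09:00:20 alice login
2023-04-20T09:00:25 bob login
2023-04-20T09:00:30 alice login
2023-04-20T09:00:33 alice logout
"""

LINE = re.compile(r"^(?P<ts>\S+) (?P<user>\S+) (?P<event>login|logout)$")

def parse(text):
    """Yield (user, event) for each well-formed line; skip anything else."""
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            yield m.group("user"), m.group("event")

def max_consecutive_logins(text):
    """Longest run of login events per user not broken by that user's logout."""
    run, best = {}, {}
    for user, event in parse(text):
        run[user] = run.get(user, 0) + 1 if event == "login" else 0
        best[user] = max(best.get(user, 0), run[user])
    return best

def crisp_rule(count):
    """IF the number of consecutive logins by a user is greater than 3
    THEN the behavior is suspicious (non-fuzzy rule: fires 1.0 or 0.0)."""
    return 1.0 if count > 3 else 0.0

def fuzzy_rule(count):
    """Fuzzy variant: suspicion grows from 0 at 3 logins to 1 at 6 or more
    (the ramp endpoints are an assumption of this example)."""
    return max(0.0, min(1.0, (count - 3) / 3))

def decide(component_outputs, alarm_level=0.5):
    """Decision component: combine the outputs of several misuse components by
    taking the maximum (assumed combination rule) and alarm at alarm_level."""
    combined = max(component_outputs.values()) if component_outputs else 0.0
    return combined >= alarm_level, combined

def analyse(text):
    """Run both rules over the log: {user: (count, crisp, fuzzy, alarm)}."""
    out = {}
    for user, count in max_consecutive_logins(text).items():
        crisp, fuzzy = crisp_rule(count), fuzzy_rule(count)
        alarm, _ = decide({"crisp_logins": crisp, "fuzzy_logins": fuzzy})
        out[user] = (count, crisp, fuzzy, alarm)
    return out

if __name__ == "__main__":
    for user, row in analyse(LOG).items():
        print(user, row)
```

In the constructed log alice logs in five times in a row before logging out, so both forms of the rule fire for her, while bob's logins are separated by a logout and never exceed the threshold.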
Genetic Algorithms
Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).
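A small genetic algorithm that tunes the two parameters of one membership function, as an added example rather than the project's implementation. The chromosome is the (a, b) pair of a HIGH membership function for DP; the population is initialized at random, evolved with tournament selection, crossover and Gaussian mutation, and the best, worst and mean fitness are recorded per generation, the three curves of the evolution plot described above. The fitness used here is a stand-in, clearly an assumption: instead of mining rule sets and comparing them with the reference set, it rewards parameters under which normal windows barely belong to HIGH and scan windows belong strongly. The DP samples are constructed.

```python
import random

# Constructed DP samples (distinct destination ports per 2-second window):
# one list from intrusion-free traffic, one from a port-scan-like data set.
# Values are made up for this example, not measurements from the slides.
NORMAL_DP = [1, 2, 2, 3, 3, 4, 4, 5, 6, 7]
SCAN_DP = [8, 9, 11, 12, 15, 18, 20, 24, 30, 40]

def shoulder(x, a, b):
    """Membership in HIGH defined by two parameters: 0 up to a, rising
    linearly to 1 at b (the two-parameter shape the slides describe)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def fitness(chrom):
    """Higher is better. A stand-in for the slides' fitness (which compares
    mined rule sets): normal windows should barely belong to HIGH and scan
    windows should belong strongly, so score = mean HIGH(scan) - mean HIGH(normal)."""
    a, b = chrom
    if b <= a:
        return -1.0          # degenerate membership function
    normal = sum(shoulder(x, a, b) for x in NORMAL_DP) / len(NORMAL_DP)
    scan = sum(shoulder(x, a, b) for x in SCAN_DP) / len(SCAN_DP)
    return scan - normal

def random_chromosome(rng, hi=50.0):
    """A random (a, b) pair with a <= b."""
    a, b = sorted((rng.uniform(0, hi), rng.uniform(0, hi)))
    return [a, b]

def tournament(pop, fits, rng, k=3):
    """Pick the fittest of k random individuals."""
    idx = rng.sample(range(len(pop)), k)
    return pop[max(idx, key=lambda i: fits[i])]

def crossover(p1, p2, rng):
    """Swap whole parameters between parents."""
    child = [p1[0], p2[1]] if rng.random() < 0.5 else [p2[0], p1[1]]
    return sorted(child)

def mutate(chrom, rng, rate=0.3, scale=2.0):
    """Gaussian perturbation of each parameter, kept non-negative and ordered."""
    out = [max(0.0, g + rng.gauss(0, scale)) if rng.random() < rate else g for g in chrom]
    return sorted(out)

def evolve(generations=80, pop_size=30, seed=1):
    """Return (best_chromosome, history); history holds, per generation, the
    fitness of the most fit individual, of the least fit individual and the
    population mean, the three curves of the slides' evolution plot."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fits = [fitness(c) for c in pop]
        history.append((max(fits), min(fits), sum(fits) / len(fits)))
        elite = pop[fits.index(max(fits))]
        new_pop = [list(elite)]               # elitism: the best survives unchanged
        while len(new_pop) < pop_size:
            child = crossover(tournament(pop, fits, rng), tournament(pop, fits, rng), rng)
            new_pop.append(mutate(child, rng))
        pop = new_pop
    fits = [fitness(c) for c in pop]
    return pop[fits.index(max(fits))], history

if __name__ == "__main__":
    best, history = evolve()
    for gen in (0, 10, 40, len(history) - 1):
        hi, lo, mean = history[gen]
        print(f"gen {gen:3d}: best={hi:.3f} worst={lo:.3f} mean={mean:.3f}")
    print("tuned HIGH parameters (a, b):", [round(p, 2) for p in best], "fitness", round(fitness(best), 3))
```

Because the elite is copied unchanged into every new generation, the best-fitness curve never decreases, which is the monotone improvement the slides read off Figure 7; with these samples the ideal parameters sit at a = 7 and b = 8, where every normal window scores 0 and every scan window scores 1.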
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 10042023 10
Data Mining
for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Misuse detection
Anomaly detection
These models can be more sophisticated and precise than manually created signatures
Unable to detect attacks whose instances have not yet been observed
Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)
Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it
Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)
Partial membership in sets or categories
042023 11
Anomaly Detection via Fuzzy Data Mining
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Automatically learn patterns from large quantities of data
The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection
Fuzzy logic
Data Mining
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Logic
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Data Mining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection, each in a crisp and a fuzzy form:
Association rules
Frequency episodes
Fuzzy association rules
Fuzzy frequency episodes
Association Rules
Association rules were developed to find correlations between items in transactions, originally using retail data.
For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.
The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
Fuzzy Association Rules
Partitioning a quantitative attribute into crisp intervals gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.
The fuzzy association rule method of Kuok, Fu and Wong (1998, see References) allows a value to contribute to the support of more than one fuzzy set.
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data and the similarity of this new rule set to the reference set is computed.
An example of a fuzzy association rule from one set of audit data is:
SN=LOW, FN=LOW → RN=LOW, c = 0.924, s = 0.49
where SN is the number of SYN flags, FN is the number of FIN flags and RN is the number of RST flags in a 2 second period.
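The following is an added example, not code from the presentation or from the IIDS project it describes. It puts the fuzzy-logic slide and the fuzzy association rule slide together: each linguistic label (LOW, MEDIUM, HIGH) is a membership function fixed by two breakpoints, fuzzy support is computed by letting every window contribute its degree of membership, rules of the form SN=LOW, FN=LOW → RN=LOW are mined against minsupport and minconfidence, and a new rule set is scored against the reference set. The membership shapes, the breakpoints, the product t-norm, the sample 2-second windows and the similarity measure are all constructed assumptions (the slides give no formula for the similarity).

```python
"""Fuzzy association rules over per-window traffic features (SN, FN, RN).
Added example; shapes, breakpoints, sample windows and the similarity
measure are constructed assumptions, not the IIDS project's own code."""
from itertools import combinations

# One membership function per label, each fixed by two breakpoints (a, b),
# matching the "two parameters per membership function" on the slides.
def mu_low(x, a, b):
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)

def mu_high(x, a, b):
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def mu_medium(x, a, b):
    mid = (a + b) / 2.0                     # triangle peaking between a and b
    if x <= a or x >= b:
        return 0.0
    return (x - a) / (mid - a) if x <= mid else (b - x) / (b - mid)

SHAPES = {"LOW": mu_low, "MEDIUM": mu_medium, "HIGH": mu_high}
# Breakpoints per attribute (SYN, FIN, RST counts per 2-second window).
# Constructed values; the real system tunes these with a GA.
PARAMS = {"SN": (5, 25), "FN": (5, 25), "RN": (2, 22)}

def membership(record, item):
    attr, label = item
    return SHAPES[label](record[attr], *PARAMS[attr])

def fuzzy_support(records, items):
    """Mean over records of the product of the items' memberships
    (product t-norm is an assumption; min is another common choice)."""
    total = 0.0
    for r in records:
        deg = 1.0
        for it in items:
            deg *= membership(r, it)
        total += deg
    return total / len(records)

def mine_fuzzy_rules(records, minsupport, minconfidence):
    """Return {(antecedent, consequent): (c, s)} for every rule X -> Y with
    at most one label per attribute and a single-item consequent Y."""
    items = [(a, l) for a in PARAMS for l in SHAPES]
    rules = {}
    for size in range(2, len(PARAMS) + 1):
        for itemset in combinations(items, size):
            if len({a for a, _ in itemset}) != size:   # one label per attribute
                continue
            s = fuzzy_support(records, itemset)
            if s < minsupport:
                continue
            for y in itemset:
                x = tuple(i for i in itemset if i != y)
                sx = fuzzy_support(records, x)
                c = s / sx if sx else 0.0
                if c >= minconfidence:
                    rules[(x, y)] = (c, s)
    return rules

def similarity(reference, new):
    """Constructed measure: a reference rule scores 1 when the new set holds it
    with the same c and s, less as they diverge, 0 if absent; then averaged."""
    if not reference:
        return 0.0
    total = 0.0
    for rule, (c1, s1) in reference.items():
        if rule in new:
            c2, s2 = new[rule]
            total += max(0.0, 1.0 - max(abs(c1 - c2) / c1, abs(s1 - s2) / s1))
    return total / len(reference)

def windows(sn, fn, rn):
    """Constructed per-window records from parallel SYN/FIN/RST count lists."""
    return [{"SN": a, "FN": b, "RN": c} for a, b, c in zip(sn, fn, rn)]

# Constructed audit windows: two normal sets, one scan-like set (many SYN
# and RST, few FIN). Nothing here was captured from a real network.
REFERENCE = windows([2, 3, 1, 4, 2, 10, 3, 2, 5, 1], [2, 3, 1, 3, 1, 8, 3, 2, 4, 1],
                    [0, 1, 0, 1, 0, 4, 1, 0, 2, 0])
BASELINE = windows([1, 2, 3, 2, 4, 3, 1, 2, 3, 2], [1, 2, 2, 1, 3, 2, 1, 2, 3, 1],
                   [0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
SCAN = windows([40, 55, 48, 60, 45, 52, 41, 58, 50, 47], [0, 1, 0, 2, 1, 0, 1, 0, 2, 1],
               [30, 35, 28, 40, 33, 38, 26, 36, 31, 29])

def anomaly_scores(minsupport=0.1, minconfidence=0.6):
    """Mine the reference rule set and score each test set against it,
    using the thresholds quoted for the slide's experiment."""
    ref = mine_fuzzy_rules(REFERENCE, minsupport, minconfidence)
    return {"reference_rules": len(ref),
            "baseline": similarity(ref, mine_fuzzy_rules(BASELINE, minsupport, minconfidence)),
            "scan": similarity(ref, mine_fuzzy_rules(SCAN, minsupport, minconfidence))}
```

With the constructed data the baseline set keeps almost all of the reference rules (similarity above 0.9) while the scan-like set shares none of them (similarity 0), which is the separation the slide's figure reports for its own data sets. Note that every reference rule here involves only LOW labels, because one mildly elevated window is not enough to push any MEDIUM or HIGH itemset above minsupport; a value near a breakpoint contributes fractionally to two labels rather than flipping between them, which is exactly the sharp boundary problem the fuzzy sets avoid.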
The figure shows results from one experiment comparing the similarities to the reference rule set of rules mined from data without intrusions and from data with intrusions.
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence = 0.6, minsupport = 0.1). Training Data Set: reference (representing normal behavior). Test Data Sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions) and network3 (including simulated port scanning intrusions).
Frequency Episodes
The algorithm discovers simple serial frequency episodes from event sequences, based on minimal occurrences. It was later extended to mine fuzzy frequency episodes.
An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The occurrence is minimal if there is no occurrence of the sequence in a subinterval of the time interval.
Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.
So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent
if frequency(P)/n ≥ minfrequency.
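An added example of the minimal-occurrence definition above (not code from the presentation or the IIDS project): it finds every minimal occurrence of a serial episode in a timestamped event sequence, keeps those whose span is shorter than window, and applies the frequency(P)/n ≥ minfrequency test. The login audit trail is constructed, and n, which the slide leaves undefined, is assumed here to be the number of events in S.

```python
"""Minimal occurrences of a serial episode in an audit event sequence.
Added example; the event trail is constructed and n is assumed to be
the number of events in the sequence (the slide does not define it)."""

def minimal_occurrences(events, episode, window):
    """events: (timestamp, type) pairs sorted by time; episode: tuple of event
    types in serial order. Returns (t_start, t_end) for each minimal
    occurrence of the episode whose span is shorter than window."""
    k = len(episode)
    earliest = []   # (start index, index of the earliest completion from it)
    for i, (_, etype) in enumerate(events):
        if etype != episode[0]:
            continue
        pos, j = 1, i
        while pos < k and j + 1 < len(events):
            j += 1
            if events[j][1] == episode[pos]:
                pos += 1
        if pos == k:
            earliest.append((i, j))
    found = []
    for n, (i, j) in enumerate(earliest):
        # (i, j) is minimal unless a later start also completes by j: that
        # would be an occurrence inside a proper subinterval of [t_i, t_j].
        dominated = any(j2 <= j for _, j2 in earliest[n + 1:])
        if not dominated and events[j][0] - events[i][0] < window:
            found.append((events[i][0], events[j][0]))
    return found

def frequency(events, episode, window):
    return len(minimal_occurrences(events, episode, window))

def is_frequent(events, episode, window, minfrequency):
    """frequency(P)/n >= minfrequency with n = number of events in S (assumed)."""
    return frequency(events, episode, window) / len(events) >= minfrequency

# Constructed login audit trail: (seconds since start, event type).
EVENTS = [(0, "login_fail"), (1, "login_fail"), (2, "login_ok"), (5, "login_fail"),
          (6, "login_fail"), (7, "login_fail"), (8, "login_ok"), (20, "login_ok"),
          (21, "login_fail"), (30, "login_fail"), (40, "login_ok")]
# Episode of interest for a detector: two failed logins then a success.
EPISODE = ("login_fail", "login_fail", "login_ok")
```

On this trail the episode has two minimal occurrences shorter than a 10-second window, at [0, 2] and [6, 8]; the later pair of failures followed by a success at t = 40 spans 19 seconds and is not counted. With 11 events the frequency is 2/11, so the episode is frequent at minfrequency = 0.15 but not at 0.2. The fuzzy version on the next slide keeps this counting scheme and replaces the crisp event types with memberships of quantitative attributes such as PN=LOW.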
Fuzzy Frequency Episodes
Fuzzy frequency episodes involve quantitative attributes of an event.
An example of a fuzzy frequency episode is given below:
E1: PN=LOW, E2: PN=MEDIUM → E3: PN=MEDIUM, c = 0.854, s = 0.108, w = 10 seconds
where E1, E2 and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2 second period.
The use of fuzzy logic with frequency episodes, that is, the integration of the two methods described above, results in a reduction of the false positive error rate.
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should result.
Genetic Algorithms
Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic it is often difficult for an expert to provide "good" definitions for the membership functions of the fuzzy variables.
The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.
Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).
The goal is to increase the similarity between the rules mined from data without intrusions and the reference rule set, while decreasing the similarity between the rules mined from intrusion data and the reference rule set.
The evolution of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual and the average fitness of the whole population.
Figure 7: The evolution process for tuning fuzzy membership functions, in terms of the similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). The graph also demonstrates that the quality of the solution increases as the evolution process proceeds.
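An added example of the GA mechanics described on these slides (not code from the presentation or the IIDS project): each chromosome is a sequence of two-parameter membership function definitions, the initial population is random, and selection, crossover, mutation and elitism produce the best/worst/average fitness trace that the figure plots. The fitness is a stand-in: the real system scores a chromosome by mining rules with the tuned membership functions and comparing them with the reference rule set, whereas here a chromosome scores high when LOW fires on normal windows but not on scan windows and HIGH does the opposite. The PN samples and GA settings are constructed assumptions.

```python
"""GA tuning of two-parameter fuzzy membership functions for PN.
Added example; the fitness is a simplified stand-in for the rule-set
similarity used by the IIDS project and the PN samples are constructed."""
import random

# Distinct destination ports per 2-second window: constructed samples.
NORMAL_PN = [1, 2, 3, 2, 4, 3, 5, 2, 6, 3]
SCAN_PN = [20, 35, 48, 60, 25, 40, 55, 30, 45, 50]
GENE_MAX = 64.0          # upper bound for every breakpoint gene (assumed)

def mu_low(x, a, b):
    """LOW: full membership up to a, none from b (assumed shape)."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)

def mu_high(x, a, b):
    """HIGH: no membership up to a, full from b (assumed shape)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def repair(chrom):
    """Chromosome = [a_low, b_low, a_high, b_high]; keep each (a, b) pair
    ordered and inside [0, GENE_MAX] after crossover and mutation."""
    out = []
    for i in range(0, len(chrom), 2):
        a, b = sorted(min(max(g, 0.0), GENE_MAX) for g in chrom[i:i + 2])
        out += [a, b]
    return out

def avg(mu, xs, a, b):
    return sum(mu(x, a, b) for x in xs) / len(xs)

def fitness(chrom):
    """Stand-in fitness in [0, 1]: how well the tuned LOW and HIGH sets
    separate normal from scan windows (1.0 = perfect separation)."""
    a1, b1, a2, b2 = chrom
    low_sep = avg(mu_low, NORMAL_PN, a1, b1) - avg(mu_low, SCAN_PN, a1, b1)
    high_sep = avg(mu_high, SCAN_PN, a2, b2) - avg(mu_high, NORMAL_PN, a2, b2)
    return (low_sep + high_sep) / 2.0

def evolve(generations=40, pop_size=20, seed=1):
    """Return (best chromosome, per-generation best/worst/average fitness)."""
    rng = random.Random(seed)
    pop = [repair([rng.uniform(0, GENE_MAX) for _ in range(4)]) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        fits = [fitness(c) for c in scored]
        history.append({"best": fits[0], "worst": fits[-1], "average": sum(fits) / len(fits)})

        def pick():                        # binary tournament selection
            c1, c2 = rng.sample(scored, 2)
            return c1 if fitness(c1) >= fitness(c2) else c2

        nxt = [scored[0]]                  # elitism: best survives unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, 4)      # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [g + rng.gauss(0.0, 4.0) if rng.random() < 0.2 else g for g in child]
            nxt.append(repair(child))
        pop = nxt
    return max(pop, key=fitness), history
```

Because the elite individual is copied into every next generation, the "best" series in the history never decreases, which is the upward trend Figure 7 shows; the "worst" and "average" series are free to fluctuate as mutation explores. A chromosome such as [6, 20, 6, 20] (LOW fully on up to 6 ports and off from 20, HIGH the reverse) scores 1.0 on the constructed samples, while the untuned [0, 64, 0, 64] scores about 0.59, which is the kind of gap an expert's hand-drawn membership functions leave for the GA to close.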
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Suppose one wants to write a rule such as
If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists
Using fuzzy logic a rule like the one shown above could be written as
If the DP = highThen an unusual situation exists
DP is a fuzzy variable and high is a fuzzy set
The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining
Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection
Association Rules
Frequency episodes
Fuzzy Association Rules
Fuzzy Frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Association Rules
Association rules are developed to find correlations in transactions using retail data
For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050
The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
Genetic Algorithms
Figure: The evolution of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual and the average fitness of the whole population.
Figure 7: The evolution process for tuning fuzzy membership functions in terms of the similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). The graph also demonstrates that the quality of the solution increases as the evolution process proceeds.
Conclusion
The data mining techniques integrated with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions of the fuzzy variables used by the system and to select the most effective set of features for particular types of intrusions.
Work is currently in progress on the misuse detection components, the decision module, additional machine learning components and a graphical user interface for the system. There are also plans to extend the system to operate in a high performance cluster computing environment.
References
Ilgun, K. and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.
Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge System Laboratory, National Research Council Canada.
Kuok, C., A. Fu and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)
Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)
Queries
Association Rules
Association rules were developed to find correlations between items in transactions, originally using retail data.
For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks by the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips, and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.
The Apriori algorithm requires two thresholds: minconfidence (the minimum confidence) and minsupport (the minimum support).
Fuzzy Association Rules
Partitioning a quantitative attribute into crisp intervals gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.
The fuzzy association rule method of Kuok, Fu and Wong (1998) allows a value to contribute to the support of more than one fuzzy set.
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data and the similarity of this new rule set to the reference set is computed.
An example of a fuzzy association rule from one set of audit data is
{SN=LOW, FN=LOW} → {RN=LOW}, c = 0.924, s = 0.49
where SN is the number of SYN flags, FN is the number of FIN flags and RN is the number of RST flags in a 2-second period.
Fuzzy Association Rules
The figure shows results from one experiment comparing the similarities to the reference rule set of rules mined from data without intrusions and from data with intrusions.
Figure: Comparison of similarities between the training data set and different test data sets for fuzzy association rules (minconfidence = 0.6, minsupport = 0.1). Training data set: reference (representing normal behavior). Test data sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions) and network3 (including simulated port scanning intrusions).
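The mining procedure on this slide, together with the GA tuning of membership-function parameters described on the Genetic Algorithms slide, as one added example; it is not code from the slides or from the IIDS project. Everything beyond the slides' own definitions is an assumption: the two-breakpoint membership shape (LOW, MEDIUM and HIGH per variable), the rule-set similarity measure, the constructed per-2-second SN/FN/RN counts standing in for the reference, normal1 and mscan1 data sets, and the GA operators. The thresholds minsupport = 0.1 and minconfidence = 0.6 are those of the figure above; with 0/1 degrees the same miner reproduces the crisp Apriori support and confidence of the soft drink/potato chips example.

```python
"""Fuzzy association rules, rule-set similarity and GA tuning of membership breakpoints.
Added example, not the slides' or the IIDS authors' code; shapes and measures are assumed."""
import itertools
import random

def memberships(x, a, b):
    """LOW/MEDIUM/HIGH degrees of x for breakpoints a < b (assumed shape): LOW is 1 up to a
    and 0 from b, HIGH mirrors it, MEDIUM peaks midway between them."""
    low = 1.0 if x <= a else 0.0 if x >= b else (b - x) / (b - a)
    return {"LOW": low, "MEDIUM": 1.0 - abs(2.0 * low - 1.0), "HIGH": 1.0 - low}

def fuzzify(records, params):
    """Map {var: count} records to {(var, label): degree} rows."""
    return [{(v, lab): d for v, (a, b) in params.items()
             for lab, d in memberships(r[v], a, b).items()} for r in records]

def mine_rules(rows, minsupport, minconfidence, max_len=3):
    """Apriori over fuzzy rows: support(X) = mean over rows of the minimum degree in X,
    c = s(X u Y) / s(X). With 0/1 degrees this is the classical crisp algorithm."""
    def support(iset):
        return sum(min(r[i] for i in iset) for r in rows) / len(rows)
    frequent, prev = {}, set()
    for k in range(1, max_len + 1):
        cands = ({frozenset([i]) for i in rows[0]} if k == 1
                 else {a | b for a in prev for b in prev if len(a | b) == k})
        cur = set()
        for c in cands:
            if len({v for v, _ in c}) < k:                      # one label per variable
                continue
            if k > 1 and any(c - {i} not in prev for i in c):   # Apriori pruning
                continue
            s = support(c)
            if s >= minsupport:
                cur.add(c)
                frequent[c] = s
        prev = cur
    rules = {}
    for iset, s in frequent.items():
        for r in range(1, len(iset)):
            for x in itertools.combinations(sorted(iset), r):
                x = frozenset(x)
                conf = s / frequent[x]           # subsets of a frequent set are frequent
                if conf >= minconfidence:
                    rules[(x, iset - x)] = (round(conf, 3), round(s, 3))
    return rules

def similarity(ref, new):
    """Assumed measure: a rule in both sets scores 1 minus its c/s gap; the sum is
    divided by the larger set, so missing or extra rules reduce similarity."""
    if not ref or not new:
        return 0.0
    hits = sum(1.0 - 0.5 * (abs(ref[k][0] - new[k][0]) + abs(ref[k][1] - new[k][1]))
               for k in ref if k in new)
    return hits / max(len(ref), len(new))

def fitness(chrom, variables, reference, normal, intrusion, ms=0.1, mc=0.6):
    """sim(normal rules, reference rules) - sim(intrusion rules, reference rules)."""
    params = {v: (chrom[2 * i], chrom[2 * i + 1]) for i, v in enumerate(variables)}
    ref = mine_rules(fuzzify(reference, params), ms, mc)
    return (similarity(ref, mine_rules(fuzzify(normal, params), ms, mc))
            - similarity(ref, mine_rules(fuzzify(intrusion, params), ms, mc)))

def repair(chrom, hi):
    """Clamp genes to [0, hi] and keep every (a, b) pair ordered with a < b."""
    out = []
    for i in range(0, len(chrom), 2):
        a, b = sorted(min(max(g, 0.0), hi) for g in chrom[i:i + 2])
        out += [a, b if b > a else a + 1.0]
    return out

def evolve(variables, reference, normal, intrusion, pop=12, gens=10, hi=50.0, seed=1):
    """Elitist GA (tournament selection, uniform crossover, Gaussian mutation).
    Returns (best params, best fitness, [(best, worst, average fitness) per generation])."""
    rnd = random.Random(seed)
    score = lambda c: fitness(c, variables, reference, normal, intrusion)
    population = [repair([rnd.uniform(0, hi) for _ in range(2 * len(variables))], hi)
                  for _ in range(pop)]
    history = []
    for _ in range(gens):
        scored = sorted(((score(c), c) for c in population), reverse=True)
        fits = [f for f, _ in scored]
        history.append((fits[0], fits[-1], sum(fits) / len(fits)))
        children = [scored[0][1]]                                 # elitism
        while len(children) < pop:
            p, q = (max(rnd.sample(scored, 3))[1] for _ in range(2))  # tournaments
            child = [g if rnd.random() < 0.5 else h for g, h in zip(p, q)]
            child = [g + rnd.gauss(0, 2.0) if rnd.random() < 0.2 else g for g in child]
            children.append(repair(child, hi))
        population = children
    best_f, best_c = max((score(c), c) for c in population)
    return {v: (best_c[2 * i], best_c[2 * i + 1]) for i, v in enumerate(variables)}, best_f, history

# Constructed per-2-second counts of SYN (SN), FIN (FN) and RST (RN) flags.
VARIABLES = ["SN", "FN", "RN"]
def windows(*triples):
    return [dict(zip(VARIABLES, t)) for t in triples]
REFERENCE = windows((3, 3, 0), (4, 4, 1), (2, 2, 0), (5, 4, 1), (3, 2, 0), (4, 3, 1))
NORMAL1 = windows((3, 2, 0), (4, 4, 1), (2, 3, 0), (5, 5, 1), (3, 3, 0), (4, 4, 0))
MSCAN1 = windows((40, 2, 30), (45, 1, 35), (38, 3, 28), (50, 2, 40), (42, 1, 33), (47, 2, 36))

if __name__ == "__main__":
    best, f, hist = evolve(VARIABLES, REFERENCE, NORMAL1, MSCAN1)
    print("tuned breakpoints:", best, "fitness:", round(f, 3), "best/worst/avg per gen:", hist)
```

`fitness` is the two-component function of Figure 7, and the `history` that `evolve` returns is the best/worst/average trace of the fitness figure. Because the elite chromosome is carried over unchanged, the best fitness never decreases from one generation to the next, which is the property a tuning run should be checked against before its breakpoints are trusted.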
Frequency Episodes
The algorithm discovers simple serial frequency episodes in event sequences based on minimal occurrences; it was later extended to mine fuzzy frequency episodes.
An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. An occurrence of the episode is minimal if there is no occurrence of the sequence in a proper subinterval of the time interval.
Given a threshold window (a bound on the time span of an occurrence), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval shorter than window.
So, given another threshold minfrequency (the minimum frequency), an episode P(e1, e2, …, ek) is called frequent if frequency(P)/n ≥ minfrequency, where n is the number of events in S.
Fuzzy Frequency Episodes
Fuzzy frequency episodes involve quantitative attributes in an event.
An example of a fuzzy frequency episode is given below:
E1: PN=LOW, E2: PN=MEDIUM → E3: PN=MEDIUM, c = 0.854, s = 0.108, w = 10 seconds
where E1, E2 and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2-second period.
The integration of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.
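Minimal-occurrence counting for the serial episodes defined on the Frequency Episodes slide, run in both the crisp and the fuzzy form described above, as an added example that is not code from the slides or from the IIDS project. The event stream (PN per 2-second period), the interval partition behind the crisp LOW/MEDIUM predicates and the fuzzy membership shapes are all constructed for the example; n is taken to be the number of events in the sequence, and the confidence of an episode is taken as its support divided by the support of the episode without its last event.

```python
"""Serial frequency episodes by minimal occurrences, crisp and fuzzy.
Added example, not the slides' or the IIDS authors' code. An episode is a list of
predicates, one per event position; each maps an event's attributes to a degree
in [0, 1] (0/1 for crisp attributes, a membership degree for fuzzy ones)."""

def minimal_occurrences(events, episode, window):
    """Return [(t_start, t_end, degree)] for the minimal occurrences of a serial episode
    in `events`, a list of (timestamp, attrs) sorted by time. The degree of an occurrence
    is the minimum predicate degree over its events (1.0 when crisp); an occurrence only
    counts when its time span is at most `window`."""
    def earliest_end(start):
        # Greedy chain: from `start`, take the first later event matching each remaining
        # predicate. Returns (index of the last event, degree) or None if it cannot finish.
        idx, degree = start, episode[0](events[start][1])
        for pred in episode[1:]:
            idx += 1
            while idx < len(events) and pred(events[idx][1]) == 0:
                idx += 1
            if idx == len(events):
                return None
            degree = min(degree, pred(events[idx][1]))
        return idx, degree

    chains = []                                   # (start index, end index, degree)
    for i, (_, attrs) in enumerate(events):
        if episode[0](attrs) > 0:
            res = earliest_end(i)
            if res is not None:
                chains.append((i, res[0], res[1]))
    found = []
    for n, (i, end, degree) in enumerate(chains):
        # A later start that reaches the same end is an occurrence inside a proper
        # sub-interval, so the current one is not minimal.
        if n + 1 < len(chains) and chains[n + 1][1] == end:
            continue
        t0, t1 = events[i][0], events[end][0]
        if t1 - t0 <= window:
            found.append((t0, t1, degree))
    return found

def episode_stats(events, episode, window, minfrequency):
    """frequency(P) = sum of occurrence degrees (a plain count when crisp), s = frequency / n
    with n the number of events, c = s(P) / s(P without its last event)."""
    n = len(events)
    freq = sum(d for _, _, d in minimal_occurrences(events, episode, window))
    ante = episode[:-1]
    freq_ante = (sum(d for _, _, d in minimal_occurrences(events, ante, window))
                 if ante else float(n))
    return {"frequency": freq, "s": round(freq / n, 3),
            "c": round(freq / freq_ante, 3) if freq_ante else 0.0,
            "frequent": freq / n >= minfrequency}

# Constructed stream: PN = distinct destination ports seen in each 2-second period.
EVENTS = [(t, {"PN": pn}) for t, pn in
          zip(range(0, 26, 2), [1, 2, 6, 7, 1, 20, 1, 5, 8, 2, 1, 5, 30])]

# Crisp predicates (interval partition) and fuzzy ones (assumed membership shapes).
crisp_low = lambda a: 1.0 if a["PN"] <= 3 else 0.0
crisp_medium = lambda a: 1.0 if 4 <= a["PN"] <= 10 else 0.0
fuzzy_low = lambda a: 1.0 if a["PN"] <= 2 else 0.0 if a["PN"] >= 5 else (5 - a["PN"]) / 3
fuzzy_medium = lambda a: max(0.0, 1.0 - abs(a["PN"] - 7) / 4)

if __name__ == "__main__":
    for name, ep in (("crisp", [crisp_low, crisp_medium, crisp_medium]),
                     ("fuzzy", [fuzzy_low, fuzzy_medium, fuzzy_medium])):
        print(name, "E1 PN=LOW, E2 PN=MEDIUM -> E3 PN=MEDIUM, w=10:",
              episode_stats(EVENTS, ep, window=10, minfrequency=0.1))
```

On the constructed stream the crisp episode has two minimal occurrences (frequency 2, s = 0.154) and is frequent at minfrequency = 0.1; the fuzzy form weights the same two occurrences by how strongly their events belong to LOW and MEDIUM (frequency 1.25, s = 0.096) and falls below the threshold, which is the mechanism by which the fuzzy version trims borderline matches and hence false positives.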
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows both fuzzy and non-fuzzy rules to be implemented.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should be raised.
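The consecutive-logins rule and the decision component, as an added example written in plain Python rather than FuzzyCLIPS; it is not code from the slides or from the IIDS project. The audit lines are constructed, "consecutive" is read as login events with no other event by the same user in between, and the fuzzy version's ramp between 2 and 5 logins, the max combiner and the 0.5 alarm threshold are assumptions.

```python
"""Misuse detection rule and decision component. Added example, not the slides' or the
IIDS authors' code: the rule is written in plain Python instead of FuzzyCLIPS."""
from collections import defaultdict

# Constructed audit lines: "<time> <action> <user>".
AUDIT_LOG = """\
10:00:01 login alice
10:00:05 login alice
10:00:09 login alice
10:00:13 login alice
10:00:20 logout alice
10:01:00 login bob
10:01:30 readfile bob
10:02:00 login bob
10:02:10 login carol
10:02:11 login carol
10:02:12 login carol""".splitlines()

def consecutive_logins(lines):
    """Longest run of login events per user with no other event by that user in between
    (assumed reading of 'consecutive logins')."""
    run, longest = defaultdict(int), defaultdict(int)
    for line in lines:
        _, action, user = line.split(maxsplit=2)
        run[user] = run[user] + 1 if action == "login" else 0
        longest[user] = max(longest[user], run[user])
    return dict(longest)

def crisp_rule(count):
    """IF the number of consecutive logins by a user is greater than 3 THEN suspicious."""
    return 1.0 if count > 3 else 0.0

def fuzzy_rule(count, lo=2, hi=5):
    """Fuzzy form (assumed ramp): suspicion grows linearly from `lo` to `hi` logins,
    so 3 logins are 'somewhat' suspicious instead of flipping at a sharp boundary."""
    return min(1.0, max(0.0, (count - lo) / (hi - lo)))

def decide(component_outputs, threshold=0.5):
    """Decision component (assumed combiner): take the maximum suspicion reported by
    the misuse components and raise an alarm when it reaches the threshold."""
    level = max(component_outputs.values(), default=0.0)
    return {"level": level, "alarm": level >= threshold}

def evaluate(lines):
    """Run both rule forms over the audit log and the decision component per user."""
    result = {}
    for user, count in consecutive_logins(lines).items():
        outputs = {"crisp_logins": crisp_rule(count), "fuzzy_logins": fuzzy_rule(count)}
        result[user] = {"consecutive_logins": count, **decide(outputs)}
    return result

if __name__ == "__main__":
    for user, verdict in evaluate(AUDIT_LOG).items():
        print(user, verdict)
```

With these lines alice (four consecutive logins) trips the crisp rule and raises an alarm, bob never exceeds one, and carol's three logins give a fuzzy suspicion of one third: visible to the decision component, but below the alarm threshold until another component adds evidence.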
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Association Rules
This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category
Their method allows a value to contribute to the support of more than one fuzzy set
For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed
An example of a fuzzy association rule from one set of audit data is
SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049
where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions
Fuzzy Association Rules
Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Frequency Episodes
This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes
An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval
Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window
So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent
if frequency(P)n geminfrequency
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The integrated data mining techniques with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and at the network levelThe genetic algorithms to tune the membership functions for the fuzzy variables used by our system to and select the most effective set of features for particular types of intrusions
Currently it is used for misuse detection components the decision module additional machine learning components and a graphical user interface for the system Now it is Planning to extend this system to operate in a high performance cluster computing environment
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Referrences
Ilgun K and A Kemmerer1995 State transition analysis A rule-based intrusion detection approach IEEE Transaction on Software Engineering 21(3) 181-99
Orchard R 1995 FuzzyCLIPS version 604 userrsquos guide Knowledge System Laboratory National Research Council Canada
Kuok C A Fu and M Wong 1998 Mining fuzzy association rules in databases SIGMOD Record 17(1) 41-6 (Downloaded fromhttpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999)
Allen J Alan Christie Willima Fithen John McHugh Jed Pickel Ed Stoner 2000State of the Practice of Intrusion Detection Technologies CMUSEI-99-TR-028Carnegie Mellon Software Engineering Institute (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
042023 28
Queries
042023 28
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Fuzzy Frequency Episodes
The fuzzy frequency episodes involves quantitative attributes in an event
An example of a fuzzy frequency episode given below
E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds
where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2
second period
The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate
This is Integration of fuzzy logic with frequency episodes
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
A simple example of a rule from the misuse detection component is
IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules
Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS
APPLIED TO INTRUSION DETECTION
Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)
The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set
The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem
Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms
The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Figure 7 The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set
Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of mined ruled to the ldquonormalrdquo rules and similarity of the mined rules to the ldquoabnormalrdquo rules) This graph also demonstrates that the quality of the solution increases as the evolution process proceeds
Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION
Conclusion
The data mining techniques integrated with fuzzy logic provide new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by our system and to select the most effective set of features for particular types of intrusions.
Currently it is used for misuse detection components, the decision module, additional machine learning components and a graphical user interface for the system. It is now planned to extend this system to operate in a high performance cluster computing environment.
References
Ilgun, K. and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.
Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge System Laboratory, National Research Council Canada.
Kuok, C., A. Fu and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)
Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)
042023 28
Queries
Misuse Detection
The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.
A simple example of a rule from the misuse detection component is:
IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious
Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should result.
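The slide gives the rule in prose and says the decision component combines the output of several misuse components. As an added example for this page (not the slides' FuzzyCLIPS rules and not IIDS code), the block below runs that rule over a constructed audit log in Python: it counts consecutive logins per user, evaluates the slide's rule both in its crisp form and in a fuzzy form, and feeds the degrees to a toy decision component. The log format, the fuzzy ramp (0 at 3 logins, 1 at 6) and the alarm threshold of 0.5 are assumptions made for the example, not values from the slides.

```python
"""Added example, not code from the original slides: the slides' sample misuse
rule over a constructed audit log, with a toy decision component.
Assumptions: the log format, the two-parameter fuzzy ramp (0 at 3, 1 at 6)
and the alarm threshold (0.5) are all constructed for this example."""
import re
from collections import defaultdict

# Constructed audit log (not real data): one event per line.
SAMPLE_LOG = """\
2001-03-01T10:00:01 user=alice event=login host=ws12
2001-03-01T10:00:05 user=alice event=logout host=ws12
2001-03-01T10:01:00 user=bob event=login host=ws03
2001-03-01T10:01:04 user=bob event=login host=ws03
2001-03-01T10:01:09 user=bob event=login host=ws03
2001-03-01T10:01:15 user=bob event=login host=ws03
2001-03-01T10:01:20 user=bob event=login host=ws03
2001-03-01T10:02:00 user=carol event=login host=ws07
2001-03-01T10:02:30 user=carol event=login host=ws07
"""

LINE_RE = re.compile(
    r"^(?P<ts>\S+) user=(?P<user>\S+) event=(?P<event>\S+) host=(?P<host>\S+)$")

def parse_log(text):
    """Yield (timestamp, user, event, host) tuples; malformed lines are skipped."""
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            yield m.group("ts"), m.group("user"), m.group("event"), m.group("host")

def consecutive_logins(events):
    """Longest run of login events per user; a logout by that user resets the run."""
    run = defaultdict(int)
    longest = defaultdict(int)
    for _ts, user, event, _host in events:
        if event == "login":
            run[user] += 1
            longest[user] = max(longest[user], run[user])
        elif event == "logout":
            run[user] = 0
    return dict(longest)

def ramp(x, low, high):
    """Two-parameter fuzzy 'greater than': 0 below `low`, 1 above `high`, linear between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def crisp_rule(count):
    """The slides' rule as a non-fuzzy rule: suspicious iff consecutive logins > 3."""
    return 1.0 if count > 3 else 0.0

def fuzzy_rule(count):
    """The same rule with a fuzzy 'greater than 3' (assumed ramp: 0 at 3, 1 at 6)."""
    return ramp(count, 3, 6)

def decision(component_degrees, threshold=0.5):
    """Decision component: combine the misuse components' degrees (max) and compare."""
    combined = max(component_degrees.values(), default=0.0)
    return combined >= threshold, combined

def analyse(log_text):
    """Run both misuse rules over a log; return {user: (alarm, combined degree)}."""
    counts = consecutive_logins(parse_log(log_text))
    result = {}
    for user, count in counts.items():
        degrees = {"crisp_consecutive_logins": crisp_rule(count),
                   "fuzzy_consecutive_logins": fuzzy_rule(count)}
        result[user] = decision(degrees)
    return result

if __name__ == "__main__":
    for user, (alarm, degree) in analyse(SAMPLE_LOG).items():
        print(f"{user:6s} degree={degree:.2f} alarm={alarm}")
```

In the constructed log, bob's run of five logins fires the crisp rule outright and the fuzzy form gives 2/3, so the decision component raises an alarm; alice and carol stay below the threshold. Combining components with max is the simplest choice; a weighted or fuzzy-inference combination would take the same per-component degrees as input.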