8/11/2019 DC Meet Second
Optimal Distributive Genetic Algorithm for Mining Association Rules
Presented By
K. Indira
Under the Guidance of
Dr. S. Kanmani
Professor & Head, Department of Information Technology
Pondicherry Engineering College
Objective
To propose and implement a Self-Adaptive Distributive Genetic Algorithm for Association Rule Mining.
Data Mining
Extraction of interesting information or patterns from data in large databases is known as data mining.
ASSOCIATION ANALYSIS
Association analysis is the discovery of what are commonly called association rules.
It studies the frequency of items occurring together in transactional databases.
Association rule mining provides valuable information in assessing significant correlations.
Association Rules
Find all rules X → Y with minimum support and confidence.
Support, s: probability that a transaction contains X ∪ Y.
Confidence, c: conditional probability that a transaction having X also contains Y.
Tid   Items bought
10    Milk, Nuts, Sugar
20    Milk, Coffee, Sugar
30    Milk, Sugar, Eggs
40    Nuts, Eggs, Bread
50    Nuts, Coffee, Sugar, Eggs, Bread
[Figure: Venn diagram of customers who buy milk, customers who buy sugar, and customers who buy both]
Let minsup = 50%, minconf = 50%
Freq. Pat.: Milk:3, Nuts:3, Sugar:4, Eggs:3, {Milk, Sugar}:3
Association rules: Milk → Sugar (60%, 100%)
Sugar → Milk (60%, 75%)
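The support and confidence figures on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function names `count`, `support` and `confidence` are illustrative and not part of the presentation, and the transactions are copied from the slide's table.

```python
# Transactions copied from the slide's table (Tid -> items bought).
TRANSACTIONS = {
    10: {"Milk", "Nuts", "Sugar"},
    20: {"Milk", "Coffee", "Sugar"},
    30: {"Milk", "Sugar", "Eggs"},
    40: {"Nuts", "Eggs", "Bread"},
    50: {"Nuts", "Coffee", "Sugar", "Eggs", "Bread"},
}


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in TRANSACTIONS.values() if itemset <= items)


def support(itemset):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset) / len(TRANSACTIONS)


def confidence(antecedent, consequent):
    """Share of transactions with `antecedent` that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return count(both) / count(antecedent)


for x, y in [({"Milk"}, {"Sugar"}), ({"Sugar"}, {"Milk"})]:
    print(sorted(x), "->", sorted(y),
          f"support={support(x | y):.0%}", f"confidence={confidence(x, y):.0%}")
```

Both rules clear the slide's thresholds (minsup = 50%, minconf = 50%), which is why they are listed as association rules.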
GENETIC ALGORITHM
A Genetic Algorithm (GA) is a procedure used to
find approximate solutions to search problems
through the application of the principles of
evolutionary biology.
Genetic algorithms use biologically inspired techniques such as genetic inheritance, natural
selection, mutation, and sexual reproduction
(recombination, or crossover).
Conceptual Algorithm
[Flowchart: START, GENERATE INITIAL POPULATION, EVALUATION, TERMINAL CONDITION (Yes: STOP; No: SELECTION, then GENETIC OPERATORS (CROSSOVER, MUTATION), then back to EVALUATION)]
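The conceptual algorithm can be sketched as a short loop. This is a minimal illustration and not the presenter's implementation: the one-max fitness function, tournament selection, one-point crossover, elitism and every parameter value are assumptions chosen only to keep the example small.

```python
import random


def one_max(bits):
    """Toy fitness (assumed for illustration): the number of 1-bits."""
    return sum(bits)


def tournament(population, rng, k=2):
    """SELECTION: return the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=one_max)


def crossover(a, b, rng):
    """CROSSOVER: one-point recombination of two equal-length parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rate, rng):
    """MUTATION: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(length=20, pop_size=30, generations=50, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # GENERATE INITIAL POPULATION
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # EVALUATION
        best = max(population, key=one_max)
        # TERMINAL CONDITION: stop once the optimum is reached
        if one_max(best) == length:
            break
        # SELECTION and GENETIC OPERATORS; the best individual is carried over
        children = [best]
        while len(children) < pop_size:
            parent_a = tournament(population, rng)
            parent_b = tournament(population, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng),
                                   mutation_rate, rng))
        population = children
    return max(population, key=one_max)


best = run_ga()
print(one_max(best), best)
```

Carrying the best individual into the next generation means the best fitness never decreases between generations; a GA without elitism would not give that guarantee.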
Existing Work Contd..
I.IV Mutation
Locus point of mutation
Weight factor taken into consideration for deciding locus point
Dynamic mutation point
Mutation 1 and Mutation 2 generated
I.V Fitness Threshold
Dynamically set
TP, TN, FP, FN criteria considered
Strength of implication taken into consideration
Sustainability index, creditable index and inclusive index considered
Real values of Confidence and Support derived and applied
Predictability and Comprehensibility factors considered.
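The slide does not say how the surveyed papers define the TP, TN, FP and FN criteria. One common convention for a rule X → Y, assumed here, counts a transaction as TP when it contains both X and Y, FP when it contains X but not Y, FN when it contains Y but not X, and TN otherwise. The function name `rule_counts` and the sample baskets (reused from the earlier Association Rules slide) are illustrative only.

```python
def rule_counts(transactions, antecedent, consequent):
    """Count (TP, FP, FN, TN) for the rule antecedent -> consequent."""
    x, y = set(antecedent), set(consequent)
    tp = fp = fn = tn = 0
    for items in transactions:
        has_x, has_y = x <= items, y <= items
        if has_x and has_y:
            tp += 1   # rule fires and is correct
        elif has_x:
            fp += 1   # rule fires but the consequent is missing
        elif has_y:
            fn += 1   # consequent present without the antecedent
        else:
            tn += 1   # neither side present
    return tp, fp, fn, tn


# Baskets from the Association Rules slide.
baskets = [
    {"Milk", "Nuts", "Sugar"},
    {"Milk", "Coffee", "Sugar"},
    {"Milk", "Sugar", "Eggs"},
    {"Nuts", "Eggs", "Bread"},
    {"Nuts", "Coffee", "Sugar", "Eggs", "Bread"},
]

tp, fp, fn, tn = rule_counts(baskets, {"Milk"}, {"Sugar"})
rule_support = tp / len(baskets)      # same as support of X and Y together
rule_confidence = tp / (tp + fp)      # same as confidence of X -> Y
print(tp, fp, fn, tn, rule_support, rule_confidence)
```

Under this convention, support and confidence fall out of the four counts directly, so a fitness function built on TP, TN, FP and FN can reuse the same pass over the data.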
Existing Work Contd..
2. Methodology
Crossover replaced by symbiotic combination
Rule selection performed by the user, thereby seeding the population for the next generation
Searching for rules in K-itemsets instead of the whole database
Distributed GA performed
Dynamic immune evolution and biometric mechanism introduced
3. Application Areas
4. Evaluation Parameters
Population Size
Chromosome Length
Mutation Probability
Crossover Probability
Fitness Threshold
Support and Confidence Factor
Open Issues
Mining Rules with non fixed consequent.
Combined with other methods for multi-relation data.
Elimination of redundant rules.
Fixing optimum values for parameters.
Enhance self-adaptiveness.
Rule selection made dependent on other classes.
Algorithm could be improved to generate further
simpler rules.
Test on different domains.
Complexity prediction by using Distributed Computing.
Scalability.
Unsupervised Learning.
Proposed Work
To implement a self-adaptive Genetic Algorithm for Association Rule Mining with optimal accuracy.
To use an iterative approach that increases the number of rules extracted in each iteration, as a way to decrease the time for learning.
To propose the Self-Adaptive GA in a Distributive Environment.
Self Adaptive GA
[Figure: diagram labeled SELF ADAPTIVE]
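Only the figure label survives on this slide, so the adaptation scheme the presenter intends is not recoverable. As a hedged illustration of what "self-adaptive" often means, the sketch below uses a technique borrowed from evolution strategies: each individual carries its own mutation rate, which is perturbed before it is applied. The function name, the log-normal update and the values of `tau`, `low` and `high` are all assumptions.

```python
import math
import random


def self_adaptive_mutate(bits, rate, rng, tau=0.2, low=0.001, high=0.5):
    """Perturb the individual's own mutation rate, then mutate with it."""
    # A log-normal step keeps the rate positive; clamp it to [low, high].
    new_rate = rate * math.exp(tau * rng.gauss(0.0, 1.0))
    new_rate = min(high, max(low, new_rate))
    # Flip each bit independently with the freshly adapted rate.
    new_bits = [1 - b if rng.random() < new_rate else b for b in bits]
    return new_bits, new_rate


rng = random.Random(1)
child, child_rate = self_adaptive_mutate([0, 1, 0, 1, 1, 0, 0, 1], 0.05, rng)
print(child, round(child_rate, 4))
```

Because the rate is inherited along with the bits, rates that tend to produce fitter offspring are the ones that survive selection, so the parameter is tuned by the search itself instead of being fixed in advance.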
Work Done So Far
Literature survey performed on genetic algorithms, and a comparative study against other methods done.
Analysis of the existing rule mining method Apriori done.
Basic Genetic Algorithm for optimizing a function coded in Java.
Proposed a comparison framework for Genetic Algorithms in Association Rule Mining.
Work to be done
Implement association rule mining with the self-adaptive Genetic Algorithm on a medical dataset.
Test the same algorithm on other datasets and compare with existing methods.
Optimize results by tuning GA parameters.
Survey distributed algorithms.
Execution Plan for Next Six Months
July - August: Implementing an existing paper
August: Testing the code with a medical dataset and performing a comparative study
September: Altering the code for other datasets and comparing the results obtained
October: Making alterations to the GA factors in the code and evaluating the results
November - December: Feasibility study on the generated code to obtain the optimum result
Papers Published
Paper titled "Framework for Comparison of Association Rule Mining Using Genetic Algorithm" has been selected for the International Conference on Computers, Communication & Intelligence at VCET, Madurai.
References
Jing Li, Han Rui-feng, A Self-Adaptive Genetic Algorithm Based On Real-Coded, International Conference on Biomedical Engineering and Computer Science, Page(s): 1-4, 2010.
Chuan-Kang Ting, Wei-Ming Zeng, Tzu-Chieh Lin, Linkage Discovery through Data Mining, IEEE Magazine on Computational Intelligence, Volume 5, February 2010.
Caises, Y., Leyva, E., Gonzalez, A., Perez, R., An extension of the Genetic Iterative Approach for Learning Rule Subsets, 4th International Workshop on Genetic and Evolutionary Fuzzy Systems, Page(s): 63-67, 2010.
Shangping Dai, Li Gao, Qiang Zhu, Changwu Zhu, A Novel Genetic Algorithm Based on Image Databases for Mining Association Rules, 6th IEEE/ACIS International Conference on Computer and Information Science, Page(s): 977-980, 2007.
Peregrin, A., Rodriguez, M.A., Efficient Distributed Genetic Algorithm for Rule Extraction, Eighth International Conference on Hybrid Intelligent Systems, HIS '08, Page(s): 531-536, 2008.
References Contd..
Mansoori, E.G., Zolghadri, M.J., Katebi, S.D., SGERD: A Steady-State Genetic Algorithm for Extracting Fuzzy Classification Rules From Data, IEEE Transactions on Fuzzy Systems, Volume: 16, Issue: 4, Page(s): 1061-1071, 2008.
Xiaoyuan Zhu, Yongquan Yu, Xueyan Guo, Genetic Algorithm Based on Evolution Strategy and the Application in Data Mining, First International Workshop on Education Technology and Computer Science, ETCS '09, Volume: 1, Page(s): 848-852, 2009.
Hong Guo, Ya Zhou, An Algorithm for Mining Association Rules Based on Improved Genetic Algorithm and its Application, 3rd International Conference on Genetic and Evolutionary Computing, WGEC '09, Page(s): 117-120, 2009.
Genxiang Zhang, Haishan Chen, Immune Optimization Based Genetic Algorithm for Incremental Association Rules Mining, International Conference on Artificial Intelligence and Computational Intelligence, AICI '09, Volume: 4, Page(s): 341-345, 2009.
Thank You