Data Mining with Oracle using Classification and Clustering Algorithms
Presented by Nhamo Mdzingwa
Supervisor: John Ebden
Overview of Presentation
Recap of Proposal
Classification of Data Mining & DM Algorithms
Oracle Data Mining
Data Mining Process
Evaluation of Results
Progress so far
Updated Timeline
Plans
Objective
Investigate two types of algorithms available in Oracle10g for data mining (ODM).
Apply the two algorithms to actual data.
Analyse & evaluate results in terms of performance.
Classification of Data Mining
Directed data mining / supervised learning
builds a model that describes one particular attribute in terms of the rest of the data.
Undirected data mining / unsupervised learning
builds a model that establishes the relationships among all the input attributes by grouping them.
Classification of Data Mining algorithms
DM strategies
Unsupervised learning (input attributes but no output attributes)
Clustering: k-Means, O-Cluster
Supervised learning (input attributes and one or more output attributes)
Classification: Naive Bayes, Model Seeker, Adaptive Bayes
Estimation
Prediction: Predictive variance
Association Discovery
Visualization
Algorithms offered in Oracle10g
Classification
1. Adaptive Bayes Network
2. Naive Bayes
3. Model Seeker
Clustering
1. k-Means
2. O-Cluster
3. Predictive variance
Association rules
1. Apriori
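To make the clustering idea concrete, here is a minimal sketch of the generic k-means procedure in plain Python. It is not Oracle's implementation and does not use the ODM API; the function name `k_means`, the choice of the first k points as starting centroids, and the toy data are all assumptions made for illustration.

```python
import math

def k_means(points, k, iterations=10):
    """Generic k-means on a list of numeric tuples (illustrative only)."""
    # Assumption: the first k points are used as the starting centroids
    # so that the example is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignments

# Hypothetical toy data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, assignments = k_means(points, k=2)
```

Note that no output attribute is supplied: the grouping comes only from the input attributes, which is what makes this an unsupervised method.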
Evaluation of Results
Evaluation of supervised learning models involves determining the level of predictive accuracy, measured on test data sets.
Compare the confidence and support levels of models created from the same training data to determine accuracy.
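As a rough illustration of the measures named above: accuracy is the share of test records a model labels correctly, while support and confidence are the standard measures attached to association rules. The sketch below uses hypothetical function names and made-up toy data; it is not ODM code and not project data.

```python
def accuracy(predicted, actual):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical test-set labels: 3 of the 4 predictions are correct.
predicted = ["yes", "no", "yes", "yes"]
actual = ["yes", "no", "no", "yes"]
acc = accuracy(predicted, actual)

# Hypothetical transactions for the rule "bread => milk".
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
sup = support(transactions, {"bread", "milk"})
conf = confidence(transactions, {"bread"}, {"milk"})
```

Here bread and milk appear together in 2 of 4 transactions (support 0.5), and milk appears in 2 of the 3 transactions that contain bread (confidence about 0.67).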
Progress
Literature survey
Oracle10g installed on Athena in Hons Lab
Exploring the Oracle9i and 10g Suite, including JDeveloper
Member of MetaLink (Oracle's online support service)
Updated Timeline
Continuation from literature and tutorials: done
Investigate Clustering & Classification algorithms (theory): done
Find suitable computerised case studies of the use of the above algorithms, with or without Oracle: done
Search for datasets for testing (possibilities: AIDS data & faculty data): in progress
Apply algorithms to the data found, then critically analyse & assess results: second semester
Write up paper: September vacation and 3rd term
Final project write-up: due 7/11