
Data Mining with Oracle using Classification and Clustering Algorithms Presented by Nhamo Mdzingwa Supervisor: John Ebden



Data Mining with Oracle using Classification and Clustering Algorithms

Presented by Nhamo Mdzingwa

Supervisor: John Ebden


Overview of Presentation

Recap of Proposal
Classification of Data Mining & DM Algorithms
Oracle Data Mining
Data Mining Process
Evaluation of Results
Progress so far
Updated Timeline Plans


Objective

Investigate two types of algorithms available in Oracle10g for data mining (ODM).

Apply the two algorithms to actual data.

Analyse and evaluate results in terms of performance.


Classification of Data Mining

Directed data mining / supervised learning

Builds a model that describes one particular attribute in terms of the rest of the data.

Undirected data mining / unsupervised learning

Builds a model that establishes the relationships amongst all the input attributes by grouping.
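
As an illustration of the directed case, the sketch below trains a very small Naive Bayes classifier in plain Python: one attribute ("play") is described in terms of the others. It is not Oracle's implementation; the toy records, the attribute names and the functions train_naive_bayes and predict are assumptions made up for this example.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, target):
    """Count class frequencies and, per class, the frequency of each attribute value."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> Counter of values
    distinct = defaultdict(set)          # attribute -> set of values seen in training
    for row in rows:
        for attr, value in row.items():
            if attr == target:
                continue
            value_counts[(attr, row[target])][value] += 1
            distinct[attr].add(value)
    return class_counts, value_counts, distinct

def predict(model, record):
    """Return the class with the highest (unnormalised) posterior score."""
    class_counts, value_counts, distinct = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for attr, value in record.items():
            seen = value_counts[(attr, label)][value]
            # Laplace smoothing so an unseen value does not zero the score
            score *= (seen + 1) / (count + len(distinct[attr]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: "play" is the output attribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]
model = train_naive_bayes(rows, target="play")
prediction = predict(model, {"outlook": "sunny", "windy": "no"})
```

In the undirected case there would be no "play" column to learn from; the records would only be grouped by similarity.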


Classification of Data Mining algorithms

DM strategies

Unsupervised learning (input attributes but no output attributes)
Clustering: k-Means, O-Cluster

Supervised learning (input attributes and one or more output attributes)
Classification: Naive Bayes, Model Seeker, Adaptive Bayes
Estimation
Prediction: Predictive variance

Association Discovery

Visualization


Algorithms offered in Oracle10g

Classification
1. Adaptive Bayes Network
2. Naive Bayes
3. Model Seeker

Clustering
1. k-Means
2. O-Cluster
3. Predictive variance

Association rules
1. Apriori
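
To make the clustering idea concrete, here is a minimal sketch of textbook k-Means in plain Python. It is not Oracle's version of the algorithm; the sample points, the function name k_means and the choice to seed the centroids with the first k points are assumptions for illustration only.

```python
def k_means(points, k, iterations=10):
    """Textbook k-Means on 2-D points, seeded with the first k points."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data: two visually separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

With this data the points near (1, 1) end up in one cluster and the points near (8.5, 9) in the other, without any output attribute being supplied.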


Evaluation of Results

Evaluation of supervised learning models involves determining the level of predictive accuracy.

Evaluated using test data sets.

Compare confidence and support levels of models created from the same training data to determine accuracy.
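
The measures named on this slide can be sketched in a few lines of Python: predictive accuracy against a test set, and the support and confidence of a rule. The function names and the sample data are hypothetical, and the definitions used are the standard textbook ones rather than anything specific to ODM.

```python
def accuracy(predicted, actual):
    """Fraction of test cases where the predicted class equals the actual class."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

def support_and_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Hypothetical test-set labels and model predictions.
acc = accuracy(["yes", "no", "yes", "yes"], ["yes", "no", "no", "yes"])

# Hypothetical transactions for the rule bread -> milk.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence = support_and_confidence(transactions, {"bread"}, {"milk"})
```

Here three of the four predictions match, and the rule holds in two of the three transactions that contain bread.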


Progress

Literature Survey
Oracle10g installed on Athena in Hons Lab
Exploring the Oracle9i and 10g Suite, including JDeveloper
Member of MetaLink (Oracle’s online support service)


Updated Timeline

Continuation from literature and tutorials: done
Investigate Clustering & Classification algorithms (theory): done
Find suitable computerised case studies of the use of the above algorithms, with or without Oracle: done
Search datasets for testing (possibilities: AIDS data & faculty data): in progress
Apply algorithms to data found, then critically analyse & assess results: second semester
Write up paper: September vacation and 3rd term
Final project write up: due 7/11