Data Mining
GyuHyeon Choi
‘80s
The term began to be used within the research community
‘80s
Definition: a set of mechanisms and techniques to extract hidden information from data
SQL-style querying is not data mining
‘90s
Definition: a sub-process within KDD (Knowledge Discovery in Databases)
Distinct from data preparation, analysis, and visualization, which are other parts of KDD
‘90s
Became significantly more popular
ACM SIGKDD annual conference, 1995
European PKDD conference, 1997
Pacific/Asia PAKDD conference, 1997
‘90s
Contribution of technological advances: processing power, data storage capability
‘90s
Processing of large volumes of data became possible, even on desktop machines
Commercial enterprises started to maintain data to support commercial activities, but not to mine it
‘90s
Large supermarket chains introduced customer loyalty cards to record customer purchases
They then started mining purchasing patterns
Present
Mining non-standard data: text mining, image mining, graph mining
Present
Collective term
Distinct from data preparation, analysis, and visualization
Sometimes even called “big data”
Present
Domain of AI (Artificial Intelligence) and KE (Knowledge Engineering)
Present
An application rather than a technology
Before We Go
Data mining techniques: pattern extraction, clustering, classification
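To make one of the techniques listed above concrete, here is a minimal clustering sketch using k-means. The slides do not name a specific algorithm, so the choice of k-means, the function name `kmeans`, the farthest-first initialisation, and the sample points are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Tiny k-means sketch. Returns (labels, centroids).

    Assumption: centroids are initialised by farthest-first traversal
    (start from the first point, then repeatedly take the point farthest
    from the centroids chosen so far) to keep the example deterministic.
    """
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        centroids.append(tuple(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids))))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical data: two visually separate groups of 2-D points.
sample = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(sample, k=2)
print(labels)
```

Clustering groups unlabeled points by similarity, whereas classification would need labeled examples to learn from.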
Before We Go
Examples of classification
Problem
Curse of dimensionality
Arises with data in high-dimensional spaces
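One common way to see the curse of dimensionality is distance concentration: as the number of dimensions grows, the nearest and farthest neighbours of a query point end up at almost the same distance. The sketch below measures this on uniformly random points. The function name `relative_contrast`, the uniform distribution, and the point counts are assumptions chosen for illustration, not taken from the slides.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min over distances from a random query point
    to n_points random points, all drawn uniformly from [0, 1]^dim."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)

# The contrast shrinks as the dimension grows, so "nearest" carries
# less and less information in high-dimensional data.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

When the contrast approaches zero, distance-based methods such as nearest-neighbour classification and clustering have less signal to work with.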
Problem
Earlier classification (maybe)
Problem
Current classification
Example
Clustering
Example
Anomaly detection
Example
Recommender system
Problem (again)
Current classification
Topic
Complexity: the more complex society becomes, the more complex data mining becomes
Dimension Reduction
Use only the most significant dimensions
Cannot satisfy people’s demands
Waste of storage
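A minimal sketch of keeping only the most significant dimensions is shown below. The slides do not say how significance is measured, so this example assumes that a dimension with higher sample variance is more significant. The function name `top_variance_dimensions` and the sample rows are hypothetical.

```python
def top_variance_dimensions(rows, k):
    """Keep the k columns with the largest sample variance.

    Assumption: variance is used as a stand-in for "significance".
    Returns (kept_column_indices, reduced_rows).
    """
    n = len(rows)
    d = len(rows[0])
    variances = []
    for j in range(d):
        column = [row[j] for row in rows]
        mean = sum(column) / n
        variances.append(sum((x - mean) ** 2 for x in column) / (n - 1))
    # Rank columns by variance, keep the top k, restore original order.
    ranked = sorted(range(d), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in rows]
    return keep, reduced

# Hypothetical data: the middle column barely changes between rows.
rows = [[1.0, 5.0, 10.0],
        [2.0, 5.1, 20.0],
        [3.0, 4.9, 30.0]]
keep, reduced = top_variance_dimensions(rows, k=2)
print(keep, reduced)
```

Whatever is dropped here is lost for good, which is one way to read the concern that this approach cannot satisfy every demand placed on the data.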
Principal Component Analysis
Orthogonal transformation
Computationally expensive
Still doubtful that it can satisfy people’s demands
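As a rough illustration of the idea, the sketch below finds only the first principal component: it builds the sample covariance matrix and runs power iteration to approximate its dominant eigenvector. A full PCA would compute all components, usually with a linear algebra library. The function names, the fixed starting vector, and the sample points are assumptions made for this example.

```python
import math

def first_principal_component(rows, iterations=200):
    """Approximate the unit direction of greatest variance.

    Assumption: power iteration from an all-ones start vector, which
    fails if that vector is exactly orthogonal to the true component.
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d): this is the costly part when d is large.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    return v

def project(rows, component):
    """Reduce each centered row to one coordinate along the component."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [sum((row[j] - means[j]) * component[j] for j in range(d))
            for row in rows]

# Hypothetical data lying close to the line y = 2x.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
component = first_principal_component(points)
print(component)
print(project(points, component))
```

Building the covariance matrix alone takes on the order of n·d² operations for n rows and d dimensions, which gives a sense of why the method is considered computationally expensive for high-dimensional data.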
Next Solution?
Currently no solutions
Or no problem
Future Solution
No algorithmic solution
Future Solution
Advanced hardware
Super-supercomputer
Future Problem
Human beings as data?
Dimensions of the human being
Future Problem
Can people be satisfied?
More and more sophisticated demands
More and more dimensions of data
• Frans Coenen, The Knowledge Engineering Review, Vol. 26:1, 25-29, Cambridge University Press, 2011
• Curse of dimensionality, Wikipedia
• Andrew Tarantola, The Quantum D-Wave 2 Is 3,600 Times Faster than a Super Computer, GIZMODO, April 2015
• Peter Rüst, Dimensions of the Human Being and of Divine Action, www.asa3.org/ASA/PSCF/, Volume 57, Number 3, September 2005