Data Mining GyuHyeon Choi

‘80s

• The term “data mining” began to be used
  • Within the research community

‘80s

• Definition
  • A set of mechanisms and techniques to extract hidden information from data
  • SQL-style querying is not data mining

‘90s

• Definition
  • A sub-process within KDD (Knowledge Discovery in Databases)
  • Distinct from data preparation, analysis, and visualization, which are other parts of KDD

‘90s

• Became significantly more popular
  • ACM SIGKDD annual conference, 1995
  • European PKDD conference, 1997
  • Pacific/Asia PAKDD conference, 1997

‘90s

• Contribution of technological advances
  • Processing power
  • Data storage capability

‘90s

• Processing of large volumes of data
  • Even on desktop machines
• Commercial enterprises started to maintain data
  • To support commercial activities
  • But not to mine it

‘90s

• Large supermarket chains
  • Introduced customer loyalty cards to record customer purchases
  • Started mining purchasing patterns

Present

• Mining non-standard data
  • Text mining
  • Image mining
  • Graph mining

Present

• Collective term
  • Distinct from data preparation, analysis, and visualization
  • Sometimes even called “big data”

Present

• A domain of AI and KE
  • Artificial Intelligence
  • Knowledge Engineering

Present

• An application
  • Rather than a technology

Before We Go

• Data mining techniques
  • Pattern extraction
  • Clustering
  • Classification
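
As a rough illustration of pattern extraction, the sketch below counts which pairs of items appear together in at least a given fraction of shopping baskets. The basket data, the function name `frequent_pairs`, and the `min_support` threshold are all made up for illustration; they are not taken from the slides.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs found in at least
    min_support (a fraction between 0 and 1) of the baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and deduplicate so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical loyalty-card baskets (assumed data).
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
print(frequent_pairs(baskets, min_support=0.5))
```

Each of the three possible pairs occurs in two of the four baskets here, so all of them pass a 0.5 threshold and none pass a higher one.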

Before We Go

Examples of classification
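
The original figure for this slide is not available. As a stand-in, here is a minimal sketch of one common classification method, k-nearest neighbours, which labels a new point by majority vote among the closest labelled points. The training data and the name `knn_classify` are assumptions for illustration only.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """train is a list of (features, label) pairs. Returns the majority
    label among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Assumed toy data: two well-separated groups.
train = [
    ((1.0, 1.0), "low"), ((1.2, 0.9), "low"), ((0.8, 1.1), "low"),
    ((8.0, 8.0), "high"), ((8.1, 7.9), "high"), ((7.9, 8.2), "high"),
]
print(knn_classify(train, (1.1, 1.0)))
print(knn_classify(train, (7.5, 8.0)))
```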

Problem

• Curse of dimensionality
  • Data in high-dimensional spaces
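
One way to see the curse of dimensionality is that distances "concentrate": in high dimensions the nearest and farthest neighbours of a point end up almost equally far away, which weakens distance-based methods. The sketch below measures this on uniformly random points. The function name `relative_contrast`, the sample size, and the choice of the unit hypercube are assumptions for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of the distances from one random query point
    to n_points random points in the unit hypercube [0, 1]^dim."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


# The contrast should shrink as the number of dimensions grows.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```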

Problem

Earlier classification (maybe)

Problem

Current classification

Example

Clustering
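
The slide does not name a clustering algorithm, so as an assumed example here is a minimal sketch of k-means (Lloyd's algorithm). To keep it deterministic it uses a farthest-first choice of starting centroids rather than the usual random start; the data and the name `kmeans` are illustrative.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's k-means on a list of coordinate tuples.
    Returns (centroids, labels)."""
    # Farthest-first start: begin with the first point, then repeatedly
    # add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, labels


# Assumed toy data: two groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```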

Example

Anomaly detection
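
A very simple form of anomaly detection flags values that lie unusually far from the mean, measured in standard deviations (a z-score). This is only one of many approaches and is an assumption here, as are the spending figures and the name `zscore_anomalies`.

```python
import statistics


def zscore_anomalies(values, threshold=2.0):
    """Return the indices of values lying more than `threshold`
    population standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Hypothetical daily spend on one card; the last value is unusual.
spend = [20, 22, 19, 21, 23, 20, 22, 21, 200]
print(zscore_anomalies(spend))
```

Note that a large outlier inflates the mean and standard deviation it is judged against, so more robust statistics (such as the median) are often preferred in practice.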

Example

Recommender system
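
As an assumed example of how a recommender system can work, the sketch below implements a small user-based collaborative filter: items a user has not rated are scored by other users' ratings, weighted by how similar those users are (cosine similarity). The ratings and the names `cosine` and `recommend` are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(ratings, user, top_n=1):
    """Rank items the user has not rated by the similarity-weighted
    ratings of the other users."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for item, rating in their.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Assumed ratings on a 1 to 5 scale.
ratings = {
    "ann": {"bread": 5, "milk": 4},
    "bob": {"bread": 5, "milk": 5, "butter": 4},
    "cat": {"tea": 5, "milk": 1},
}
print(recommend(ratings, "ann"))
```

Here "ann" rates items much like "bob" does, so bob's butter rating outweighs the tea rating from the less similar "cat".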

Problem (again)

Current classification

Topic

• Complexity
  • The more complex society becomes, the more complex data mining becomes

Dimension Reduction

• Use only the most significant dimensions
• Cannot satisfy people’s demands
• Waste of storage
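
The slide does not say how "most significant" is measured. One simple reading, assumed here, is to keep the columns with the largest variance and drop the rest. The names `top_variance_columns` and `project` and the sample rows are illustrative.

```python
import statistics


def top_variance_columns(rows, k):
    """Return the indices of the k columns with the largest
    population variance, listed in column order."""
    columns = list(zip(*rows))
    variances = [statistics.pvariance(col) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def project(rows, keep):
    """Keep only the selected columns of every row."""
    return [[row[j] for j in keep] for row in rows]


# Assumed data: only the middle column varies much.
rows = [
    [1.0, 10.0, 0.5],
    [1.1, 20.0, 0.4],
    [0.9, 30.0, 0.6],
    [1.0, 40.0, 0.5],
]
keep = top_variance_columns(rows, k=1)
print(keep, project(rows, keep))
```

Variance depends on the units of each column, so in practice the columns are usually rescaled before comparing them. The dropped columns are simply lost, which is one way this approach can fail to meet the demands placed on it.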

Principal Component Analysis

• Orthogonal transformation
• Computationally expensive
• Still doubtful whether it can satisfy people’s demands
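
To make the idea concrete, the sketch below finds only the first principal component: it centres the data, builds the sample covariance matrix, and runs power iteration to approximate the direction of greatest variance. This is a simplified stand-in for full PCA, not a production method. The function names, iteration count, and data are assumptions, and the sketch relies on the starting vector not being orthogonal to the leading direction.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (means, direction): the column means and a unit vector
    approximating the leading eigenvector of the sample covariance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Building the d x d covariance matrix costs O(n * d^2).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # the start vector picked up no variance at all
        vec = [x / norm for x in nxt]
    return means, vec


def project_onto(rows, means, vec):
    """One-dimensional coordinates of the rows along `vec`."""
    return [sum((r[j] - means[j]) * vec[j] for j in range(len(vec))) for r in rows]


# Assumed data lying exactly on the line y = 2x.
rows = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
means, vec = first_principal_component(rows)
print(vec)
print(project_onto(rows, means, vec))
```

Because these points lie on a single line, one component captures all of the variance; real data rarely reduces this cleanly.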

Next Solution?

• Currently no solutions
• Or no problem

Future Solution

No algorithmic solution

Future Solution

• Advanced hardware
  • Super-supercomputer

Future Problem

• Human beings as data?
  • Dimensions of the human being

Future Problem

• Can people be satisfied?
  • More and more sophisticated demands
  • More and more dimensions of data

References

• Frans Coenen, The Knowledge Engineering Review, Vol. 26:1, 25-29, Cambridge University Press, 2011

• Curse of dimensionality, Wikipedia

• Andrew Tarantola, The Quantum D-Wave 2 Is 3,600 Times Faster than a Super Computer, GIZMODO, April 2015

• Peter Rüst, Dimensions of the Human Being and of Divine Action, www.asa3.org/ASA/PSCF/, Volume 57, Number 3, September 2005