Agenda
• Data Mining, Data Science
• Data Analytics in Business
• Big Data
• Data Analytics Techniques
  o Classification (Decision tree, ANN, K-nearest neighbor, Bayesian)
  o Regression
  o Cluster
  o Text Mining
  o SVM
  o Machine Learning
Data types vs mining methods
• Data types and models
  o Flat data tables
  o Relational databases
  o Temporal & spatial data
  o Transactional databases
  o Multimedia data
  o Genome databases
  o Materials science data
  o Textual data
  o Web data
  o Etc.
• Mining tasks and methods
  o Classification / Prediction
      Decision trees
      Bayesian classification
      Neural networks
      Rule induction
      Support vector machine (SVM)
      Hidden Markov Model
      Etc.
  o Description
      Association analysis
      Clustering
      Summarization
      Etc.
Data types
• Symbolic
  o Indexing
  o Binary
  o Boolean
  o Nominal
  o Ordinal
• Numeric
  o Integer
  o Continuous
• Structured vs Unstructured data
• Semi-structured data
• Supervised vs Unsupervised data
Data mining techniques
• Supervised Learning
(predictive ability based on past data)
  o Classification Statistics
  o Decision Trees
  o Regression
  o Artificial Neural Networks (ANN)
  o Classification machine learning
• Unsupervised Learning
(Exploratory analysis to discover patterns)
  o Clustering Analysis
  o Association Rules
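
To make the unsupervised side concrete, here is a minimal sketch of clustering with k-means on one-dimensional data. The points, the starting centroids, and the function name `kmeans_1d` are all hypothetical and chosen only for illustration; k-means is one possible clustering method, not necessarily the one the slides have in mind.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data, starting from the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical, unlabeled observations (no "past data" labels are used).
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = kmeans_1d(points, [1.0, 10.0])
print(centroids, clusters)
```

No target column is supplied here: the grouping comes only from the distances between points, which is what separates this from the supervised methods listed above.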
Study 1: Classification with decision tree
Outlook  Temp  Humidity  Windy  Play
Sunny    Hot   Normal    True   ??
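
The slide shows only the query row, not the training data. As a rough sketch of how a decision tree could fill in the missing Play value, the code below builds an ID3-style tree (splitting on information gain) in plain Python. The 14 training rows are an assumption: they are the weather toy dataset commonly used with this example, not rows taken from the slide. The function names are also hypothetical.

```python
import math
from collections import Counter

# ASSUMED training rows (not shown in the slide): the commonly used
# 14-row weather toy dataset. Columns: Outlook, Temp, Humidity, Windy, Play.
ATTRS = ["Outlook", "Temp", "Humidity", "Windy"]
ROWS = [
    ("Sunny", "Hot", "High", False, "No"),
    ("Sunny", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Rainy", "Mild", "High", False, "Yes"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Rainy", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Sunny", "Mild", "High", False, "No"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Rainy", "Mild", "High", True, "No"),
]


def entropy(rows):
    """Shannon entropy of the class label (last column)."""
    counts = Counter(r[-1] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def info_gain(rows, i):
    """Reduction in entropy from splitting on attribute index i."""
    groups = {}
    for r in rows:
        groups.setdefault(r[i], []).append(r)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(rows) - remainder


def build_tree(rows, attr_idx):
    """Return a class label (leaf) or a dict describing a split."""
    labels = Counter(r[-1] for r in rows)
    majority = labels.most_common(1)[0][0]
    if len(labels) == 1 or not attr_idx:
        return majority
    best = max(attr_idx, key=lambda i: info_gain(rows, i))
    rest = [i for i in attr_idx if i != best]
    branches = {}
    for r in rows:
        branches.setdefault(r[best], []).append(r)
    return {"attr": best, "default": majority,
            "children": {v: build_tree(g, rest) for v, g in branches.items()}}


def predict(tree, sample):
    """Walk the tree; fall back to the majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(sample[tree["attr"]], tree["default"])
    return tree


tree = build_tree(ROWS, list(range(len(ATTRS))))
query = ("Sunny", "Hot", "Normal", True)  # the row from the slide
prediction = predict(tree, query)
print(ATTRS[tree["attr"]], prediction)
```

Under these assumed rows the tree splits first on Outlook and then, within the Sunny branch, on Humidity, so the query row is classified as Play = Yes. A different training set could give a different tree and a different answer.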
Study 2: Regression
• List all the variables available for building the model
• Establish a Dependent Variable (DV) of interest
• Examine the relationships between the variables of interest, visually if possible
• Find a way to predict DV using other variables
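
The steps above can be sketched with a simple linear regression on a single predictor. The data values and the helper names `fit_line` and `pearson_r` are hypothetical. Ordinary least squares is assumed here as the way to predict the DV, and the correlation coefficient stands in for the visual check.

```python
import math

# Hypothetical observations (not from the slides): x is a candidate
# predictor, y is the dependent variable (DV).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, 6.2, 7.9, 10.1]


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Strength of the linear relationship between x and y (-1 to 1)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


r = pearson_r(x, y)                # examine the relationship
slope, intercept = fit_line(x, y)  # build the model
predicted = intercept + slope * 6.0  # predict the DV for a new x
print(round(r, 4), round(slope, 2), round(intercept, 2), round(predicted, 2))
```

With more than one candidate predictor, the same idea extends to multiple regression, where each variable gets its own coefficient.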