BUSINESS INTELLIGENCE
- Making data accessible
- Wider distribution
- Dimensional slicing
- Mostly as-is reporting
DATA MINING
- Finding useful patterns in data
- Limited distribution
- Algorithms
- Insights and predictions
DATA MINING
[Diagram labels: Statistics; Computing; Machine Learning; Quantitative Operations Research; Data Stores; Computation; Optimization; Algorithms]
Data mining, in simpler terms, is finding useful patterns in data.
“It is non-trivial process of finding useful, valid, novel, understandable patterns or relationships in the data to make important decisions” (Fayyad et al., 1996)
Data Mining
Classification
Regression
Clustering
Association
Anomaly detection
Time Series
Feature Selection
Text Mining
DATA MINING: TYPES
Tasks and example applications
Classification: Assigning voters to known buckets by political party, e.g., soccer moms. Bucketing new customers into one of the known customer groups.
Regression: Predicting next year's unemployment rate. Estimating an insurance premium.
Anomaly detection: Detecting fraudulent credit card transactions. Network intrusion detection.
Time series: Sales forecasting, production forecasting, and virtually any growth phenomenon that needs to be extrapolated.
Clustering: Finding customer segments in a company based on transaction, web, and customer call data.
Association analysis: Finding cross-selling opportunities for a retailer based on transaction purchase history.
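As an illustration of the association analysis example above, here is a minimal sketch that counts item pairs in purchase baskets and reports support and confidence for each rule. The baskets, the `pair_rules` helper, and the `min_support` threshold are all hypothetical; a production tool would use an algorithm such as Apriori or FP-Growth over itemsets of any size.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one basket of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # confidence(a -> b) = P(b in basket | a in basket)
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; ties broken alphabetically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules[:3]:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules with high confidence suggest cross-selling candidates: customers who buy the antecedent often buy the consequent as well.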
DATA MINING: TYPES
Tasks and algorithms
Classification: Decision trees, neural networks, Bayesian models, rule induction, k-nearest neighbors
Regression: Linear regression, logistic regression
Anomaly detection: Distance-based, density-based, local outlier factor (LOF)
Time series: Exponential smoothing, ARIMA, regression
Clustering: k-means, density-based clustering (DBSCAN)
Association analysis: FP-Growth, Apriori
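To make one of the listed algorithms concrete, the sketch below implements k-nearest neighbors classification for the "bucketing new customers into known groups" task. The customer features, segment labels, and the `knn_classify` helper are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort labeled points by Euclidean distance to the query and keep the closest k.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (monthly_spend, visits_per_month) -> known segment.
train = [
    ((20.0, 1.0), "occasional"),
    ((25.0, 2.0), "occasional"),
    ((30.0, 1.0), "occasional"),
    ((200.0, 8.0), "loyal"),
    ((220.0, 10.0), "loyal"),
    ((180.0, 9.0), "loyal"),
]

segment = knn_classify(train, (190.0, 7.0))
print(segment)
```

The features are left unscaled to keep the sketch short; in practice, features on different scales are usually normalized before computing distances.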
DATA MINING: TYPES
Business Intelligence
Data Mining
ISSUES
- People: Data mining and business intelligence skill sets rarely overlap
- Organization: The two teams sit in different organizations within an enterprise
- Technology: Minimal overlap in tools, platforms, and technology
- Use cases: Historical reporting vs. prediction and insights
Data Mining
Business Intelligence
BENEFITS
- Distribution: Data mining insights get wider, real-time distribution
- Smarter Analytics: History + Predictions
- Visual discovery: Common link
- Security: Secure delivery of insights
CLASSIC BI ARCHITECTURE
[Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Security Layer; Dashboards, reports, alerts, ad hoc...]
ANALYTICAL ARCHITECTURE #1
Data Mining Tool Scoring
[Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Data Mining Tool; Dashboards, reports, alerts, ad hoc...]
The data mining tool does the scoring. Robust modeling and scoring capabilities. The BI tool reports the scores like any other data points. Limitations: New records cannot be scored unless scoring is provided by the DM tool. Requires multiple analytical tools.
ANALYTICAL ARCHITECTURE #2
Database Scoring
[Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Dashboards, reports, alerts, ad hoc...]
The database does the scoring. Can handle large data. Model, scoring, and data in one place. Limitations: DB vendors have to provide a full DM suite. Analysis skills.
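The database-scoring idea can be sketched with SQLite from the Python standard library: the model coefficients live in a table, and a view computes the score inside the database, next to the data. The table names, columns, and coefficient values below are hypothetical; a real deployment would rely on the in-database mining features of the chosen vendor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age REAL, prior_claims REAL)")
conn.executemany(
    "INSERT INTO customers (id, age, prior_claims) VALUES (?, ?, ?)",
    [(1, 30.0, 0.0), (2, 45.0, 2.0), (3, 60.0, 1.0)],
)

# Hypothetical linear model stored as a single row of coefficients.
conn.execute("CREATE TABLE model (intercept REAL, coef_age REAL, coef_claims REAL)")
conn.execute("INSERT INTO model VALUES (50.0, -0.5, 20.0)")

# The view applies the model inside the database engine, so BI tools
# can query scores like any other column.
conn.execute(
    """
    CREATE VIEW scored_customers AS
    SELECT c.id,
           m.intercept + m.coef_age * c.age + m.coef_claims * c.prior_claims
               AS premium_score
    FROM customers AS c CROSS JOIN model AS m
    """
)

scores = conn.execute(
    "SELECT id, premium_score FROM scored_customers ORDER BY id"
).fetchall()
print(scores)

# A newly inserted record is scored automatically the next time the view is read.
conn.execute("INSERT INTO customers (id, age, prior_claims) VALUES (4, 20.0, 3.0)")
new_score = conn.execute(
    "SELECT premium_score FROM scored_customers WHERE id = 4"
).fetchone()[0]
```

Keeping the coefficients in a table means the model can be refreshed with a single update, without touching the view or the reports that read it.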
ANALYTICAL ARCHITECTURE #3
BI Scoring: Native Modeling
[Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Dashboards, reports, alerts, ad hoc...]
The BI platform does the scoring. Good integration of predictive metrics with BI metrics. Security. Distribution. Real-time scoring. Limitations: Performance. Limited functionality.
ANALYTICAL ARCHITECTURE #4
BI Scoring: Data Mining Tool Modeling
[Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Data Mining Tool; PMML Model; Dashboards, reports, alerts, ad hoc...]
The BI platform does the scoring. The model is built in the DM tool and imported into the BI platform. Real-time scoring. Supports a wide selection of algorithms. Limitations: Performance.
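The hand-off in this architecture can be sketched as follows: a model exported by a data mining tool as PMML is parsed and then used to score records on the consuming side. The PMML fragment here is a simplified, hypothetical regression model (the XML namespace, data dictionary, and mining schema that a real export contains are omitted), and `load_regression` and `score` are illustrative helpers, not part of any BI platform's API.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical PMML fragment; a real export also carries a
# namespace, a DataDictionary, and a MiningSchema.
PMML_TEXT = """
<PMML version="4.4">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.5">
      <NumericPredictor name="visits" coefficient="0.2"/>
      <NumericPredictor name="basket_size" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

def load_regression(pmml_text):
    """Extract the intercept and per-field coefficients from the fragment."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("./RegressionModel/RegressionTable")
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("NumericPredictor")
    }
    return intercept, coefficients

def score(model, record):
    """Apply the linear model to one record (a dict of field values)."""
    intercept, coefficients = model
    return intercept + sum(coef * record[name] for name, coef in coefficients.items())

model = load_regression(PMML_TEXT)
prediction = score(model, {"visits": 10, "basket_size": 4})
print(prediction)
```

Because the model travels as a document rather than as code, the tool that trains it and the platform that scores with it can be chosen independently.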
ANALYTICAL ARCHITECTURE
Data Mining Tool Scoring
Database Scoring
BI Scoring
- Data Mining Tool Modeling
- Native Modeling
CLICKSTREAM DATA
Can be generalized to transactions. Applies to any product purchases in an enterprise.
BI VS. DATA MINING THINKING
BI: Number of customers lost last month. Data mining: Who will most likely churn in the next 10 days?
BI: Production downtime report. Data mining: Which part of the process will fail, and how can it be mitigated?
BI: ROI for marketing campaigns. Data mining: What action will the prospect take next?
BI: Yesterday's revenue. Data mining: Tomorrow's revenue.
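The contrast between the two mindsets can be sketched on the churn example: the BI question aggregates what already happened, while the data mining question scores customers who are still active. The customer records, dates, and logistic coefficients below are made up for illustration and are not fitted to any data.

```python
import math
from datetime import date

# Hypothetical customer records.
customers = [
    {"id": "A", "churned_on": date(2024, 5, 12), "days_inactive": 40, "support_calls": 3},
    {"id": "B", "churned_on": None, "days_inactive": 25, "support_calls": 4},
    {"id": "C", "churned_on": None, "days_inactive": 2, "support_calls": 0},
    {"id": "D", "churned_on": date(2024, 4, 3), "days_inactive": 70, "support_calls": 1},
]

def lost_in_month(rows, year, month):
    """BI question: how many customers were lost in a given month?"""
    return sum(
        1
        for r in rows
        if r["churned_on"] is not None
        and (r["churned_on"].year, r["churned_on"].month) == (year, month)
    )

def churn_probability(row, intercept=-3.0, w_inactive=0.1, w_calls=0.5):
    """Data mining question: how likely is this customer to churn?

    Logistic score with assumed (not fitted) coefficients.
    """
    z = intercept + w_inactive * row["days_inactive"] + w_calls * row["support_calls"]
    return 1.0 / (1.0 + math.exp(-z))

lost_last_month = lost_in_month(customers, 2024, 5)

# Rank customers who have not churned yet, most at risk first.
active = [r for r in customers if r["churned_on"] is None]
at_risk = sorted(active, key=churn_probability, reverse=True)
print(lost_last_month, [r["id"] for r in at_risk])
```

The first result is a historical fact; the second is a ranked list that can drive action before the loss happens.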
RECOMMENDED READING
OPEN SOURCE DATA MINING TOOLS