Machine Learning Smackdown
Mark Tabladillo, Lynn Langit
May 7-9, 2014 | San Jose, CA
Agenda
Goal: Survey ML tools/methods that you can actually use on the Microsoft stack
• Definitions
• Tools I – Understanding 3rd party Excel Machine Learning Add-ins
• Tools II – Using the Microsoft SQL Server SSAS & Data Mining Add-ins
• Tools III – Using Predixion Software
• Recap and Call To Action
Terms
Goal: Create common definitions of key terms
• Business Analytics
• Query
• Aggregation

• Predictive Analytics
• Machine Learning
• Statistics
• Unsupervised Data Mining
• Supervised Data Mining
• Other
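
One way to make these terms concrete is a small Python sketch on invented data. The customer rows, function names, and the choice of toy algorithms (1-D two-means, nearest-mean classifier) are assumptions made for illustration only; they are not taken from any tool covered in this session.

```python
from collections import defaultdict
from statistics import mean

# Toy data, invented for illustration: (region, monthly_spend, churned)
customers = [
    ("West", 120.0, False),
    ("West", 95.0, False),
    ("East", 20.0, True),
    ("East", 35.0, True),
    ("East", 110.0, False),
]

def average_spend_by_region(rows):
    """Business analytics: a query plus an aggregation over known facts."""
    groups = defaultdict(list)
    for region, spend, _ in rows:
        groups[region].append(spend)
    return {region: mean(spends) for region, spends in groups.items()}

def two_clusters(values, iterations=10):
    """Unsupervised data mining: 1-D k-means with k=2; no labels are used."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        if not low_group or not high_group:
            break  # all values identical; nothing to split
        low, high = mean(low_group), mean(high_group)
    return low, high

def train_churn_classifier(rows):
    """Supervised data mining: learn from labeled rows, then predict."""
    churned_mean = mean(s for _, s, c in rows if c)
    stayed_mean = mean(s for _, s, c in rows if not c)

    def predict(spend):
        # Predict churn when spend is closer to the churned customers' mean.
        return abs(spend - churned_mean) < abs(spend - stayed_mean)

    return predict

print(average_spend_by_region(customers))
print(two_clusters([spend for _, spend, _ in customers]))
predict = train_churn_classifier(customers)
print(predict(30.0), predict(100.0))
```

The aggregation only summarizes what is already recorded; the clustering finds groups without being told what to look for; the classifier needs the churned label to learn from.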
What does the market look like now?
[Chart] Values: 57%, 28%, 10%, 6%
[Chart] Legend: Regular Analytics, Unsupervised DM, Supervised DM, Machine Learning
CRISP-DM Lifecycle applied to ML
Machine Learning – an Example
An aside… about the R language
Using R
About 3rd party Excel Machine Learning Add-ins
What are they? Toolbars in Excel – many different offerings
• XLMiner
• StatsMiner
• XLStat
• RExcel
Important: All of these tools assume expert statistical knowledge
Viewing 3rd Party Add-ins: XLMiner
About the Data Mining Add-ins for Excel
What are they? Free add-ins which add menus for using SQL Server Analysis Services (SSAS) Data Mining
• Table Analysis Tools for Excel
  • Use mining models with Excel data or external data
• Data Mining Client for Excel
  • Create/test/explore/manage Mining Models
• Data Mining Templates for Visio
  • Render/share mining models as Visio Drawings
Important: Using the add-ins requires a connection to SQL Server 2012 SSAS
Using the Data Mining Add-ins for Excel
DEMO
Checking Understanding…
Data Mining Structures
• Containers for cleansed source data
Data Mining Models
• Child containers for source data plus one mining algorithm
• SSAS Algorithms - Clustering, Time Series Prediction, Market-Basket Analysis, Text Mining and Neural Networks
Model Verification, Processing and Usage Tools
• Model query, Model processing
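
The structure/model relationship can be sketched in a few lines of Python. This is a conceptual illustration only, not the SSAS object model or API: the class names `MiningStructure` and `MiningModel`, the basket data, and both toy algorithms are hypothetical stand-ins.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations
from typing import Callable

@dataclass
class MiningStructure:
    """Container for cleansed source data (here: shopping baskets)."""
    name: str
    rows: list

@dataclass
class MiningModel:
    """Child of one structure, paired with exactly one algorithm."""
    name: str
    structure: MiningStructure
    algorithm: Callable
    result: object = None

    def process(self):
        """Model processing: run the algorithm over the structure's rows."""
        self.result = self.algorithm(self.structure.rows)
        return self.result

def pair_counts(baskets):
    """Toy market-basket analysis: count item pairs bought together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(set(basket)), 2))
    return counts

def size_segments(baskets):
    """Crude stand-in for clustering: split baskets at the mean size."""
    cutoff = sum(len(b) for b in baskets) / len(baskets)
    return ["large" if len(b) > cutoff else "small" for b in baskets]

# One structure, two models: each model reuses the same source data.
baskets = MiningStructure(
    name="Baskets",
    rows=[
        ["bread", "milk"],
        ["bread", "milk", "eggs", "jam"],
        ["milk"],
        ["bread", "milk", "eggs"],
    ],
)
association = MiningModel("Basket pairs", baskets, pair_counts)
segments = MiningModel("Basket sizes", baskets, size_segments)

association.process()
segments.process()

# Model query: ask a processed model a question.
print(association.result[("bread", "milk")])
print(segments.result)
```

The point of the split is that the data is cleansed once in the structure, and several models with different algorithms can then be tried against it.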
About Predixion Software
What is it? Suite of tools for predictive analytics
• Insight Now
  • Use mining models with Excel data or external data
• Insight Analytics
  • Create/test/explore/manage Mining Models
• Insight Workbench
  • Prepare data for model creation
• Web-based Viewers and Tools
Important: Runs EITHER connected to SSAS on premises OR connected to Predixion’s cloud-based servers
Using Predixion Software
DEMO
Understanding options…
Add-in               | Server Required | Complexity of install | Other                                    | Cost of Add-in | Cost of Solution
XLMiner              | none            | easy                  | Assumes stats expertise                  | $$             | $$
RExcel               | none            | easy                  | Assumes R expertise                      | $              | $
Data Mining Add-ins  | SQL Server SSAS | medium                | Designed for single user                 | 0              | $$$
Predixion on premise | SQL Express     | easy                  | Requires local R install                 | 0              | $$-$$$
Predixion on premise | SQL Server SSAS | medium                | Your data is stored locally              | 0              | $$$$
Predixion cloud      | none            | easy                  | Supports SSAS Data Mining AND R Language | 0              | $$-$$$
Machine Learning Skills
Data Scientist
• Store
• Clean
• Aggregate
ML Engineer
• Selects Libraries
• Applies Algorithms
• Creates Solutions
ML Researcher
• Creates Algorithms
Learning Paths – ML Developers
• Learn a language… DMX, PAX, R, Mahout, Julia
• Pick your IDE, tools… SSAS, Predixion, RStudio, Weka
• Pick a problem space… Marketing, Health, Financial
• Find (purchase)/gather/prepare some data…
GO!
(Visualize results)
Call to Action – ML Decision Makers
• Pick one or more solutions
• Gather source data
• Prepare source data
• Try out some data mining algorithms
Evaluate it / Understand it
• Tooling
• Learning
• Data gathering/preparation
• Storage/hosting
• Results
www.TeachingKidsProgramming.org
• Free Courseware (Java, Small Basic or C# [on Pluralsight])
• Do a Recipe, Teach a Kid (Ages 10+)
• recipes)
Q & A ?
Thank you for attending this session and the PASS Business Analytics Conference 2014
SoCalDevGal on