Dato Confidential
Hello my name is
Neel Kishan
Technical Sales Lead
(former neuroscientist, GPU programmer, Eagle Scout, Chicago sports fan)
[email protected]’s Schedule a Time to Talk: https://calendly.com/dato-neel
We empower developers to create intelligent applications with real-time machine learning services quickly and easily.
Intelligent Applications
Dato Platform
GraphLab Create
Dato Predictive Services
Machine Learning Lifecycle
Teams have found ways to build intelligent applications…
• Recommenders
• Lead Scoring
• Churn Prediction
• Multi-channel Targeting
• Auto-Summarization
• Fraud Detection
• Intrusion Detection
• Demand Forecasting
• Data Matching
• Failure Prediction
Why do these projects take so long?
• Lengthy code rewrites for scalable production services
• Mundane tasks to integrate libraries, transform data to specific formats, fill in missing values, etc.
• Many tools are just slow
Challenges for developing intelligent apps
• Algorithm-centric APIs create confusion and a steep learning curve
• Understanding models has been a craft passed only through tribal knowledge
• Production services are hard to maintain and manage
Dato Machine Learning: built to rapidly deliver intelligent applications
• Intuitive APIs: easy to learn, with smart defaults so your first application comes together fast
• Deploy instantly as REST: eliminates the lengthy rewrites to integrate and serve live, at scale
• Integrated libraries for any data: deep learning, graphs, text, and images on a common scalable data structure, which eliminates the glue code and context switching
What makes Dato special?
The Dato Machine Learning Platform
[Diagram labels: Develop, Train, Deploy Models, Serve (REST API), Monitor, Experiments, Feedback; components: GraphLab Create & Dato Distributed, Dato Predictive Services]
on your infrastructure:
GraphLab Create & Dato Distributed
• Creating models
• Data engineering
• Evaluation & Visualization
Predictive Services
• Serving models
• Live experimentation
• Model management
Scalable Data Structures for Machine Learning
[Figure: example tables with column labels User, Com., Title, Body, Disc.]
SFrame - on-disk, columnar & partitioned table
SGraph – graph structure composed of multiple tables
TimeSeries – table with a time index
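To make "columnar and partitioned" concrete, here is a minimal pure-Python sketch of a column-oriented table. It is only an illustration of the layout idea: the class name `ColumnarTable` and its methods are hypothetical, it keeps everything in memory, and it is not how SFrame is implemented.

```python
class ColumnarTable:
    """Toy column-oriented table: one Python list per column (hypothetical, in-memory)."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}

    def append(self, row):
        # A row is a dict that must supply a value for every column.
        for name in self.columns:
            self.columns[name].append(row[name])

    def __len__(self):
        # All columns have the same length, so measure any one of them.
        first = next(iter(self.columns.values()), [])
        return len(first)

    def column(self, name):
        # Reading one column touches only that column's values.
        return self.columns[name]

    def partitions(self, size):
        # Yield consecutive row ranges, each holding a slice of every column.
        for start in range(0, len(self), size):
            yield {name: values[start:start + size]
                   for name, values in self.columns.items()}


table = ColumnarTable(["user", "movie", "rating"])
table.append({"user": "ann", "movie": "m1", "rating": 4})
table.append({"user": "bob", "movie": "m2", "rating": 5})
table.append({"user": "cat", "movie": "m1", "rating": 3})

# An aggregate over one column never reads the other columns.
mean_rating = sum(table.column("rating")) / len(table)
chunks = list(table.partitions(2))
```

Storing each column separately is what lets a single-column scan skip the rest of the table, and splitting rows into partitions is what lets chunks be processed independently.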
High performance machine learning
[Figure: test error (%) vs. time (hr). H2O.ai: 10 machines/80 cores; Dato: 4 min on 4 GPUs]
Faster algorithms accelerate teams
• recommenders
• deep learning & images
• graph analytics
Fails to complete on other systems!
Intuitive API – Easily create a live machine learning service
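import graphlab as gl
# Load the ratings data into an SFrame
data = gl.SFrame.read_csv('my_data.csv')
# Train a recommender on (user, movie, rating) rows
model = gl.recommender.create(
    data,
    user_id='user',
    item_id='movie',
    target='rating')
# Request the top 5 recommendations
recommendations = model.recommend(k=5)
# Load a deployment from S3 and add the model as a service
cluster = gl.deploy.load('s3://path')
cluster.add('servicename', model)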
Create a Recommender
5 lines of code
Toolkit w/auto selection
Deploy in minutes
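For readers who want a feel for what a top-k recommendation call returns, the following is a rough pure-Python sketch of one simple technique, item co-occurrence counting. It is an assumption-laden illustration: the functions `build_cooccurrence` and `recommend` are hypothetical, the sketch ignores the rating values, and it is not the model that the toolkit's automatic selection would choose.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(interactions):
    """Count, for each pair of items, how many users interacted with both."""
    items_by_user = defaultdict(set)
    for user, item in interactions:
        items_by_user[user].add(item)
    cooccur = defaultdict(Counter)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            cooccur[a][b] += 1
            cooccur[b][a] += 1
    return items_by_user, cooccur


def recommend(user, items_by_user, cooccur, k=5):
    """Return up to k unseen items, ranked by co-occurrence with the user's items."""
    seen = items_by_user.get(user, set())
    scores = Counter()
    for item in seen:
        for other, count in cooccur.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Highest score first; break ties by item name so results are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical (user, movie) interactions.
interactions = [
    ("ann", "m1"), ("ann", "m2"),
    ("bob", "m1"), ("bob", "m2"), ("bob", "m3"),
    ("cat", "m2"), ("cat", "m3"),
    ("dan", "m1"),
]
items_by_user, cooccur = build_cooccurrence(interactions)
dan_recs = recommend("dan", items_by_user, cooccur, k=5)
ann_recs = recommend("ann", items_by_user, cooccur, k=5)
```

Here `dan`, who has only seen `m1`, is offered the items that other `m1` viewers also saw, most frequent first.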
Dato Machine Learning Toolkits
Applications
• recommender
• sentiment_analysis
• churn_predictor
• data_matching
• pattern_mining
• anomaly_detection
Fundamentals
• regression
• classifier
• nearest_neighbors
• clustering
• deeplearning
• text_analytics
• graph_analytics
Utilities
• model_parameter_search
• cross_validation
• evaluation
• comparison
• feature_engineering
Join us April 7th for a webinar on Deep Learning: Image Similarity and Beyond
Demo of GraphLab Create (GLC) & Predictive Services (PS)
Deployment scenarios
Appendix and Supporting Material
Dato is becoming the backbone of intelligent applications for 80+ customers
• Commercialization of Carnegie Mellon ML Project founded by Professor Carlos Guestrin in 2013
• Vibrant user community numbering 40,000+ from Coursera and open source projects
• Major customers in retail, finance, media, and software
Appendix
Deployment Scenarios & Pricing
Machine Learning Deployment Options
Dato Predictive Services
Batch write of predictions
Embedded process or script
Export (e.g. PMML)
Pricing
• Subscription license which includes support and upgrades
• Licensed by user for Create & by machine for production use
• Training & technical services also available
Use Cases
Our customers are leading the creation of intelligent applications
Quantifying the value – Fastest to Production & Reduced Operational Cost
Built a 90% accurate sentiment analyzer for hotel reviews after 30 minutes of trying Dato’s GraphLab Create
Created an efficient (40 mins in Dato vs. 33 days in R) pipeline with 46% lift in accuracy
“[Dato’s] GraphLab Create™ gives us easy access to some of the most advanced machine learning and this lets us iterate on our ideas faster”
Simplify the process to develop and deploy internal services for SalesForce PDS and adjacent teams
Reduced hundreds of tools to manage, complexity of solution, and development time
Achieved in 2 days with Dato’s GraphLab Create what took 2 weeks in R
Dropped concept to deployment from months to minutes
Replace a heuristic-heavy job ranking system to improve job search relevance
Developed in weeks with significant increase in clickthrough after years of no growth
Fraud Detection and Security
Customer Success Story
“Merchant intelligence for safer, more profitable commerce.”
Others like Alan & G2 Web Services:
WHO: Alan Krumholz, Principal Data Scientist
INSPIRATION: Score merchants based on their web presence and actions to help their banking customers identify fraudulent merchants.
VALUE: Accelerate business decisions, reducing manual intervention required and minimizing false positives.
OUTCOME: Achieved in 2 days with GraphLab Create what took two weeks in R. Dropped deployment from months to minutes.
Data Matching
Customer Success Story
“Fast, free, thorough home search.”
Others like Nick & Zillow:
WHO: Nicholas McClure, Senior Data Scientist
INSPIRATION: Build a service that matches property listings across many inbound data feeds and collapses them to a single most accurate listing.
VALUE: Data & listing quality is critical to Zillow’s core product.
OUTCOME: Created an efficient pipeline (40 minutes in GLC vs. 33 days in R) with much higher accuracy (95%, up from 65%).
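The idea of matching property listings across feeds and collapsing them into one listing can be sketched in a few lines of plain Python. The deck does not describe the actual matching method, so everything here is an assumption: the field names, the `normalize_address` and `collapse_listings` helpers, and the exact-key matching rule are hypothetical, and a production system would likely need fuzzy matching rather than exact keys.

```python
import re
from collections import defaultdict


def normalize_address(address):
    """Lowercase, strip punctuation, and expand a few common abbreviations."""
    abbreviations = {"st": "street", "ave": "avenue", "rd": "road"}
    tokens = re.sub(r"[^a-z0-9 ]", " ", address.lower()).split()
    return " ".join(abbreviations.get(token, token) for token in tokens)


def collapse_listings(listings):
    """Group listings by normalized address and merge each group into one record."""
    groups = defaultdict(list)
    for listing in listings:
        groups[normalize_address(listing["address"])].append(listing)

    merged = {}
    for key, group in groups.items():
        record = {}
        # Earlier listings win; later ones only fill fields still missing.
        for listing in group:
            for field, value in listing.items():
                if value not in (None, "") and field not in record:
                    record[field] = value
        merged[key] = record
    return merged


# Hypothetical listings arriving from two inbound feeds.
listings = [
    {"feed": "A", "address": "12 Main St.", "price": 500000, "beds": None},
    {"feed": "B", "address": "12 main street", "price": None, "beds": 3},
    {"feed": "A", "address": "9 Oak Ave", "price": 300000, "beds": 2},
]
merged = collapse_listings(listings)
```

The two spellings of the Main Street address collapse into one record that takes the price from feed A and the bedroom count from feed B.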
Recommenders
Customer Success Story
They are the site for “Advice and support on pregnancy and parenting.”
Others like Shelley & BabyCenter:
WHO: Shelley Klopp, DBA & Chief Architect
INSPIRATION: Build and deploy their first recommender to increase session engagement by recommending relevant content
VALUE: Initial model increased average session by multiple page views
OUTCOME: First prototype built in < 1 week. Ongoing model experimentation is increasing engagement
Sentiment and Text Analysis
Customer Success Story
“Get hired. Love your job.”
Others like Marcos and Glassdoor:
WHO: Marcos Sainz, Lead Machine Learning Engineer
INSPIRATION: Replace a heuristic-heavy job ranking system with an ML-driven system to improve job search relevance
VALUE: More relevant jobs led to happier users and higher clickthrough
OUTCOME: Concept to production in weeks
Image Analytics and Deep Features
Customer Success Story
“Smart waste management.”
Others like Ben & Compology:
WHO: Ben Chehebar, Co-founder/Lead of Product
INSPIRATION: Use machine learning to predict how full dumpsters are.
VALUE: This allows them to augment their human classification (done via Mechanical Turk) and to scale their operations.
OUTCOME: Concept to deployed service in less than a month, with accuracy as good as or better than the humans.