ezCater Accomplishes Early Prediction of LTV with
RapidMiner
Agenda
1. It begins: “The Promotion”
2. Estimating Lifetime Value: A summary of my Googling
3. Aside: Why Y1R instead of LTV/CLV?
4. Getting the Data
5. Training
6. What did we learn about machine learning?
7. Limits of Regression
8. Enter RapidMiner
A Bit About ezCater
ezCater is an online marketplace for business and corporate catering. Need food for your
next team lunch? Order it on ezCater!
We’ve raised $169 million to date
We are growing 2-3x year over year
Working at ezCater is AWESOME and yes, we are hiring!
We have 400+ employees
A Bit About Jeff Dwyer
@jdwyah
engineering.ezcater.com
Distinguished Software Engineer
Works on growth at ezCater
Not a machine learning expert
ezCater the business
• ezCater is part marketplace, part SaaS
• Customers love us, thus they keep using us
• Good retention makes the unit economics SaaS-like
• Customer quality > Number of new customers
The Promotion
You’re a SaaS business. You've been acquiring customers and on average they're worth about X. Somebody clever suggests a promotion: "$5 off your first order."
You release it and boom! Conversion rates increase and the number of new customers increases. Yay, right?
But then someone asks that pernicious question: "Are we sure these are still 'good' customers?"
Put simply: "Is the promotion worth it?"
Disclaimer*
*All the numbers in this webinar are made up :)
SaaS Unit Economics
https://www.forentrepreneurs.com/saas-metrics-2/
RETENTION. IS. KING.
RETENTION. IS. LTV.
A Summary of my Googling
Estimating LTV
Basics
https://blog.profitwell.com/how-to-calculate-ltv-for-saas-the-right-way
• Totally “correct”
• Totally not actionable
Example
• I have 1000 users at ARPU of $1000
• Idea: Give new users a $500 iPad if they sign up and we’ll still make $500!
Customer Quality Matters
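To make the iPad example concrete, here is a back-of-the-envelope sketch. The $1000 ARPU and the $500 iPad come from the slide; the $200 revenue figure for promotion-acquired users is a made-up assumption, chosen only to show how a lower-quality cohort can turn the promotion into a loss.

```python
# Figures from the slide
ARPU = 1000.0       # average revenue per existing user
IPAD_COST = 500.0   # cost of the promotion per new user

# Hypothetical assumption: users attracted by a free iPad spend far less
PROMO_USER_REVENUE = 200.0


def net_value(revenue_per_user: float, promo_cost: float) -> float:
    """Revenue left per new user after paying for the promotion."""
    return revenue_per_user - promo_cost


# Naive view: promo users behave like the average existing user
naive = net_value(ARPU, IPAD_COST)
# Alternative view: promo users are a lower-quality cohort
actual = net_value(PROMO_USER_REVENUE, IPAD_COST)

print(f"Naive expectation per promo user: ${naive:,.0f}")
print(f"With lower-quality promo users:   ${actual:,.0f}")
```

The average is only a safe guide if the new customers resemble the old ones, which is exactly what a promotion may change.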
It all boils down to one question:
Given X days of customer data, how can I predict their year-one revenue, and at what accuracy?
Naming things matters
Y1R vs LTV
LTV is Complex
“LTV” and “CLV” are loaded terms. They imply:
• Multi-year retention profiles
• Weighted cost of capital
Y1R is Simple
Y1R = “Year 1 Revenue”
• Just as actionable as LTV
• No arguing about definitions
• Extensible!
• Y1B = Year 1 Bookings
• Y1M = Year 1 Margin
• EY1R = Estimated Year 1 Revenue
• D180R = Day 180 Revenue
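One way to read these definitions is as the same calculation with a different window length. The sketch below assumes the clock starts at a customer's first order and that "Year 1" means the first 365 days; both are assumptions, as are the function names and the made-up order history.

```python
from datetime import date, timedelta


def revenue_within(orders, days):
    """Sum revenue for orders placed within `days` days of the first order.

    `orders` is a list of (order_date, amount) tuples for one customer.
    """
    if not orders:
        return 0.0
    first_order = min(order_date for order_date, _ in orders)
    cutoff = first_order + timedelta(days=days)
    return sum(amount for order_date, amount in orders if order_date < cutoff)


def y1r(orders):
    """Year 1 Revenue (assumed: first 365 days after the first order)."""
    return revenue_within(orders, 365)


def d180r(orders):
    """Day 180 Revenue (assumed: first 180 days after the first order)."""
    return revenue_within(orders, 180)


# Made-up order history for a single customer
orders = [
    (date(2018, 1, 10), 103.45),
    (date(2018, 3, 2), 80.00),
    (date(2018, 9, 15), 120.00),
    (date(2019, 2, 1), 95.00),  # more than a year after the first order
]

print("Y1R:  ", round(y1r(orders), 2))
print("D180R:", round(d180r(orders), 2))
```

Y1B or Y1M would follow the same pattern with bookings or margin in place of the revenue amount.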
Data
We’re going to need some

Training
id    Order size  Day of week  Food type  Time of day  Head-count  Location  Actual Y1R
1001  103.45      Mon          Mex        10           9           MA        500
1002  140.12      Fri          BBQ        11           11          NH        200
1003  35.00       Sat          Thai       9            3           CA        10
1004  201.12      Mon          Mex        12           20          TX        30
1005  55.32       Tue          Burg       12           3           MI        14

New Data
id    Order size  Day of week  Food type  Time of day  Head-count  Location  Actual Y1R
1008  93.45       Sun          BBQ        9            8           VT
1009  123.99      Sat          Burg       14           10          MI
1010  18.00       Mon          Mex        9            22          TX
1011  182.12      Tue          Mex        16           9           FL
1012  65.32       Tue          Burg       12           3           MI
DATA THAT LOOKS LIKE THIS IS WHAT ML LOVES
Data
But… the data is not in the warehouse
What is Stitch?
A SaaS platform for consolidating data from a wide array of data sources to data warehouses for analysis.
Getting going with Stitch
• Self-serve setup is easy
• Free plans for smaller data volumes
• Standard paid plans start at $100
• No commitment required
• No incremental cost for additional data sources
A platform with extensible data sources
• Any developer can build an integration
• Send data to Stitch, or another destination
• Existing Stitch integrations run on Singer
• Enables developers to support any use case
Singer is an open-source standard for writing scripts that move data
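For readers who have not seen Singer, a tap is a script that writes JSON messages to standard output, one per line, using the SCHEMA, RECORD and STATE message types from the Singer specification. The sketch below is a minimal illustration, not ezCater's integration: the `orders` stream, its fields and the bookmark layout are hypothetical.

```python
import json
import sys


def singer_messages(stream, schema, key_properties, rows, bookmark):
    """Yield Singer SCHEMA, RECORD and STATE messages as dicts."""
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    }
    for row in rows:
        yield {"type": "RECORD", "stream": stream, "record": row}
    # STATE lets the next run resume where this one stopped
    yield {"type": "STATE", "value": bookmark}


def write_messages(messages, out=sys.stdout):
    """Write one JSON message per line, as a Singer target expects."""
    for message in messages:
        out.write(json.dumps(message) + "\n")


# Hypothetical stream definition and rows
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "order_size": {"type": "number"},
    },
}
rows = [
    {"id": 1001, "order_size": 103.45},
    {"id": 1002, "order_size": 140.12},
]
messages = list(
    singer_messages("orders", schema, ["id"], rows, {"orders": {"last_id": 1002}})
)
write_messages(messages)
```

Because the output is plain JSON lines, the same tap can be piped into any Singer target, which is where the "less lock-in" argument comes from.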
Stitch is an Enterprise Solution
A simple solution to a complex problem
Data Sources
• Flexible UI
• Instant Connections
• Configurable Frequencies
• Historical Backload
• 70+ Integrations Today
Destinations
• Amazon Redshift
• Google BigQuery
• Snowflake
• Panoply
• PostgreSQL
Open Source
• Powered by singer.io
• Simple & Composable
• JSON Based Standard
• Extensible by Anyone
• Embed in your Application
Platform
• Integration
• Scalability
• Reliability
• Security
• Compliance
• Extensibility
Better Together :)
Read the Stitch - ezCater case study
https://www.stitchdata.com/customers/ezcater-enterprise-etl/
Data: Stitch
Data: Stitch Singer
• No waiting for AppsFlyer integration
• No ongoing costs to support AppsFlyer
• Contributing to existing HubSpot integration code
• Less lock-in
• Contractors can contribute
• Singer Slack Channel
Time to let the machines learn
Learning
“When starting, always do a simple regression first”
- Everybody
Training: Start Linear
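As a rough illustration of "start linear", the sketch below fits an ordinary least-squares line from order size to Actual Y1R using the toy training rows from the slides, then scores the New Data rows. Using a single feature is a simplification made here for brevity, and the numbers are the talk's made-up ones, so the output shows the mechanics only.

```python
from statistics import mean

# Toy training rows from the slides (the talk notes all numbers are made up)
order_size = [103.45, 140.12, 35.00, 201.12, 55.32]
actual_y1r = [500.0, 200.0, 10.0, 30.0, 14.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(order_size, actual_y1r)

# Score the "New Data" rows, which have no Actual Y1R yet
new_order_size = {1008: 93.45, 1009: 123.99, 1010: 18.00, 1011: 182.12, 1012: 65.32}
ey1r = {cid: intercept + slope * size for cid, size in new_order_size.items()}

for cid, estimate in ey1r.items():
    print(cid, round(estimate, 2))
```

With five rows the fit itself is meaningless; the value of the baseline is having something simple to compare later models against.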
Expectations
This… isn’t going to be easy
Training: Vowpal Wabbit
Out-of-core learning algorithm
• Popular on Kaggle
• Free
• Gradient descent
• Linear
• Logistic
• Neural
Input format: 40 |b event_day_of_week=Mon |c event_local_time:1200
Pros:
• Readable input format
• Super fast: <1 min for 500k
• Built-in protection against overfitting
Cons:
• Analysis totally DIY
• Viewing feature weightings wonky
• Totally DIY pipeline
• Regression not a fabulous fit
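The input line on the slide follows Vowpal Wabbit's text format: a label, then one or more namespaces introduced by `|`, each holding features. A token like `name=value` acts as a categorical feature, and `name:number` is a numeric feature with that value. The helper below is a hypothetical sketch that reproduces the slide's line; placing categorical features in namespace `b` and numeric ones in `c` is read off the example, not a documented ezCater convention.

```python
def to_vw_line(label, categorical, numeric):
    """Format one example in Vowpal Wabbit's text input format.

    Categorical features go in namespace `b` as name=value tokens;
    numeric features go in namespace `c` as name:value pairs.
    """
    cat_part = " ".join(f"{name}={value}" for name, value in categorical.items())
    num_part = " ".join(f"{name}:{value}" for name, value in numeric.items())
    return f"{label} |b {cat_part} |c {num_part}"


line = to_vw_line(
    40,
    {"event_day_of_week": "Mon"},
    {"event_local_time": 1200},
)
print(line)
```

Feature names and categorical values must not contain spaces, `:` or `|`, since those characters are part of the format itself.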
Limits of Regression
Regression + co-dependence
• People who order on the weekend are MUCH worse
• People who make fewer orders are worse
BUT
If people make many orders, then it doesn’t matter whether they order on the weekend!
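A small worked example shows why this trips up a plain regression. The sketch fits an additive model, y = a + b*weekend + c*many_orders, to four made-up group averages in which the weekend only hurts customers who place few orders. The revenue figures are invented for illustration; because there is exactly one observation per group, the least-squares coefficients reduce to differences of group means.

```python
from statistics import mean

# Made-up average year-one revenue for each (weekend, many_orders) group.
# Weekend ordering only hurts customers who place few orders.
cells = {
    (0, 0): 100.0,  # weekday, few orders
    (1, 0): 10.0,   # weekend, few orders
    (0, 1): 500.0,  # weekday, many orders
    (1, 1): 500.0,  # weekend, many orders
}


def fit_additive(cells):
    """Least-squares fit of y = a + b*weekend + c*many_orders.

    With one observation per cell (a balanced design), the least-squares
    coefficients are differences of group means.
    """
    b = mean([y for (w, _), y in cells.items() if w == 1]) - mean(
        [y for (w, _), y in cells.items() if w == 0]
    )
    c = mean([y for (_, m), y in cells.items() if m == 1]) - mean(
        [y for (_, m), y in cells.items() if m == 0]
    )
    a = mean(list(cells.values())) - 0.5 * b - 0.5 * c
    return a, b, c


a, b, c = fit_additive(cells)
for (w, m), actual in cells.items():
    print((w, m), "actual:", actual, "predicted:", a + b * w + c * m)
```

The fitted model applies the same weekend penalty to everyone, so it underestimates the weekend customers with many orders and overestimates the weekday ones. An additive model has no way to say "it depends".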
Alternatives to Regression
• Gradient boosted trees
• Random forests
• Clustering
• Neural nets
• SO MANY OTHER CHOICES
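To give a feel for why tree-based methods cope with co-dependent features, here is a minimal sketch of a single regression tree, the building block of both gradient boosted trees and random forests. It is not a boosting implementation, and the four data points (weekend, many_orders, revenue) are made up: weekend ordering only matters for customers with few orders.

```python
from statistics import mean


def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mu = mean(ys)
    return sum((y - mu) ** 2 for y in ys)


def build_tree(rows, depth):
    """Greedy regression tree over 0/1 features.

    Returns a float (leaf prediction) or (feature_index, subtree_0, subtree_1).
    """
    ys = [y for _, y in rows]
    if depth == 0 or len(set(ys)) == 1:
        return mean(ys)
    best = None
    for j in range(len(rows[0][0])):
        left = [r for r in rows if r[0][j] == 0]
        right = [r for r in rows if r[0][j] == 1]
        if not left or not right:
            continue  # this feature does not split the rows
        cost = sse([y for _, y in left]) + sse([y for _, y in right])
        if best is None or cost < best[0]:
            best = (cost, j, left, right)
    if best is None:
        return mean(ys)
    _, j, left, right = best
    return (j, build_tree(left, depth - 1), build_tree(right, depth - 1))


def predict(tree, x):
    """Walk the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        j, zero_branch, one_branch = tree
        tree = one_branch if x[j] == 1 else zero_branch
    return tree


# Made-up rows: features are (weekend, many_orders), target is revenue
rows = [((0, 0), 100.0), ((1, 0), 10.0), ((0, 1), 500.0), ((1, 1), 500.0)]
tree = build_tree(rows, depth=2)
for x, actual in rows:
    print(x, "actual:", actual, "predicted:", predict(tree, x))
```

The tree splits on order volume first and only looks at the weekend flag inside the low-volume branch, which is the "it depends" structure an additive regression cannot express.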
Training Choices
1. Install xgboost, learn Python & scikit-learn, or maybe R
2. See whether there’s something to one of these machine learning companies
RapidMiner
• Test 4 totally different techniques on the exact same data!
• Compare ROC curves trivially!
• Free trial
• ~1 hour to get my CSV -> Regression outputs
• Super easy to explain pipeline to new colleagues
• Query Redshift right from RapidMiner
• Generate new attributes
• Clear visual pipeline
• Simple analysis
• No lack of detail
• Nice things “just work”
(It works!)
Results
• R^2 .23, Spearman’s Rho .648, Accuracy of prediction .82
• R^2 .504, Spearman’s Rho .834, Accuracy of prediction .89
• R^2 .73, Spearman’s Rho .920, Accuracy of prediction .94
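For reference, two of the reported metrics can be computed as follows. This is a generic sketch with made-up actual and predicted values, not the deck's evaluation code; "accuracy of prediction" is not defined in the slides, so it is left out. R^2 measures how much of the variance in actual revenue the predictions explain, while Spearman's rho only asks whether customers are ranked in the right order.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = (
        sum((x - x_bar) ** 2 for x in xs) * sum((y - y_bar) ** 2 for y in ys)
    ) ** 0.5
    return num / den


def spearman_rho(actual, predicted):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(actual), ranks(predicted))


# Made-up values for illustration
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print("R^2:", round(r_squared(actual, predicted), 3))
print("Spearman's rho:", round(spearman_rho(actual, predicted), 3))
```

A model can have a modest R^2 and still rank customers well, which may be enough when the goal is comparing the quality of cohorts.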
EY1R shows major differences in cohort quality
Blue and Red converted the best… but lost on EY1R
Results: Top Line Company Metrics
[Charts: # New Customers / week and New EY1R / week, Before vs. After]
Challenges
1. Ingo’s 2nd mistake
2. Life changes underneath you: utm_values
3. But what does it mean?
4. KISS
5. Mixing time horizons
6. Productionization
Takeaways
1. Estimating LTV is doable
2. Never say “LTV” by itself: prefer EY1R etc.
3. RapidMiner allows mere mortals to use data science
4. Gradient Boosted Trees are great. (aka “Listen to YY”)
Let’s Connect
Jeff Dwyer
@jdwyah
engineering.ezcater.com
Software Engineering Manager
Works on growth at ezCater
Not a machine learning expert