
ezCater Accomplishes Early Prediction of LTV with RapidMiner

Agenda

1. It begins: “The Promotion”

2. Estimating Lifetime Value: A summary of my Googling

3. Aside: Why Y1R instead of LTV/CLV?

4. Getting the Data

5. Training

6. What did we learn about machine learning?

7. Limits of Regression

8. Enter RapidMiner

A Bit About ezCater

ezCater is an online marketplace for business and corporate catering. Need food for your next team lunch? Order it on ezCater!

We’ve raised $169 million to date

We are growing 2-3x year over year

Working at ezCater is AWESOME and yes, we are hiring!

We have 400+ employees

A Bit About Jeff Dwyer

@jdwyah

engineering.ezcater.com

Distinguished Software Engineer

Works on growth at ezCater

Not a machine learning expert

ezCater the business

• ezCater is part marketplace, part SaaS

• Customers love us, thus they keep using us

• Good retention makes the unit economics SaaS-like

• Customer quality > Number of new customers

The Promotion

You’re a SaaS business. You've been acquiring customers, and on average they're worth about X. Somebody clever suggests a promotion: "$5 off your first order."

You release it and boom! Conversion rates increase and the number of new customers increases. Yay, right?

But then someone asks that pernicious question: "Are we sure these are still 'good' customers?"

Put simply: "Is the promotion worth it?"

Disclaimer*

*All the numbers in this webinar are made up :)

SaaS Unit Economics

https://www.forentrepreneurs.com/saas-metrics-2/

RETENTION. IS. KING.

RETENTION. IS. LTV.

A Summary of my Googling

Estimating LTV

Basics

https://blog.profitwell.com/how-to-calculate-ltv-for-saas-the-right-way


• Totally “correct”

• Totally not actionable

Example

• I have 1000 users at ARPU of $1000

• Idea: Give new users a $500 iPad if they signup and we’ll still make $500!


Customer Quality Matters
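To see why the average hides the problem, here is a small worked example. Only the 1000 users, the $1000 ARPU, and the $500 iPad come from the slide; the split into 100 high-value and 900 low-value users is an assumption, made up for illustration like every other number in this webinar.

```python
# Hypothetical revenue split: the slide only gives 1000 users at $1000 ARPU.
segments = {
    "high_value": {"users": 100, "revenue_per_user": 9100},
    "low_value": {"users": 900, "revenue_per_user": 100},
}
IPAD_COST = 500

total_users = sum(s["users"] for s in segments.values())
total_revenue = sum(s["users"] * s["revenue_per_user"] for s in segments.values())
arpu = total_revenue / total_users  # works out to the slide's $1000

# Naive view: every new user is worth the average.
naive_margin = arpu - IPAD_COST

# If the promotion mostly attracts users like the low-value segment,
# each iPad loses money.
low_value_margin = segments["low_value"]["revenue_per_user"] - IPAD_COST

print(f"ARPU: ${arpu:.0f}")
print(f"Naive margin per promo user: ${naive_margin:.0f}")
print(f"Margin if promo users look like the low-value segment: ${low_value_margin}")
```

The average says each iPad earns $500. Under this assumed split, a promo user who resembles the low-value majority loses $400.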

Basics

It all boils down to one question:

Given X days of customer data, how can I predict a customer’s year-one revenue, and at what accuracy?

Naming things matters: Y1R vs LTV

LTV is Complex

“LTV” and “CLV” are loaded terms. They imply:

• Multi-year retention profiles

• Weighted cost of capital

Y1R is Simple

Y1R = “Year 1 Revenue”

• Just as actionable as LTV

• No arguing about definitions

• Extensible!

  • Y1B = Year 1 Bookings

  • Y1M = Year 1 Margin

  • EY1R = Estimated Year 1 Revenue

  • D180R = Day 180 Revenue
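These names are all the same calculation with a different window. A minimal sketch, assuming the window starts at the customer's first order and that orders arrive as `(order_date, amount)` tuples; both are assumptions, since the slides do not say how ezCater defines the start date or stores orders.

```python
from datetime import date, timedelta


def revenue_within(orders, days):
    """Sum order amounts placed in the first `days` days after the first order.

    `orders` is a list of (order_date, amount) tuples (hypothetical shape).
    """
    if not orders:
        return 0.0
    first = min(d for d, _ in orders)
    cutoff = first + timedelta(days=days)
    return sum(amount for d, amount in orders if d < cutoff)


def y1r(orders):
    """Year 1 Revenue."""
    return revenue_within(orders, 365)


def d180r(orders):
    """Day 180 Revenue."""
    return revenue_within(orders, 180)


orders = [
    (date(2018, 1, 10), 100.0),
    (date(2018, 3, 1), 50.0),
    (date(2018, 9, 1), 75.0),   # after day 180, still inside year one
    (date(2019, 2, 1), 200.0),  # after year one, not counted
]
print("Y1R:", y1r(orders), "D180R:", d180r(orders))
```

Swapping revenue for bookings or margin gives Y1B or Y1M without touching the window logic.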

Data

We’re going to need some data.

Training

id    Order size  Day of week  Food type  Time of day  Head-count  Location  Actual Y1R
1001  103.45      Mon          Mex        10           9           MA        500
1002  140.12      Fri          BBQ        11           11          NH        200
1003  35.00       Sat          Thai       9            3           CA        10
1004  201.12      Mon          Mex        12           20          TX        30
1005  55.32       Tue          Burg       12           3           MI        14

New Data

id    Order size  Day of week  Food type  Time of day  Head-count  Location  Actual Y1R
1008  93.45       Sun          BBQ        9            8           VT
1009  123.99      Sat          Burg       14           10          MI
1010  18.00       Mon          Mex        9            22          TX
1011  182.12      Tue          Mex        16           9           FL
1012  65.32       Tue          Burg       12           3           MI


DATA THAT LOOKS LIKE THIS IS WHAT ML LOVES
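What "ML loves" about this shape is that every row turns directly into a numeric feature vector plus a label. A minimal sketch using three rows from the training table; the snake_case column names and the `encode` helper are hypothetical, and one-hot encoding is just one common way to handle the categorical columns.

```python
import csv
import io

# Rows copied from the training table (the numbers are made up in the talk).
TRAINING_CSV = """id,order_size,day_of_week,food_type,time_of_day,head_count,location,actual_y1r
1001,103.45,Mon,Mex,10,9,MA,500
1002,140.12,Fri,BBQ,11,11,NH,200
1003,35.00,Sat,Thai,9,3,CA,10
"""

NUMERIC = ["order_size", "time_of_day", "head_count"]
CATEGORICAL = ["day_of_week", "food_type", "location"]


def encode(rows):
    """Return (feature_names, X, y) with categorical columns one-hot encoded."""
    levels = sorted({f"{col}={r[col]}" for r in rows for col in CATEGORICAL})
    names = NUMERIC + levels
    X = []
    for r in rows:
        present = {f"{col}={r[col]}" for col in CATEGORICAL}
        numeric_part = [float(r[c]) for c in NUMERIC]
        one_hot_part = [1.0 if level in present else 0.0 for level in levels]
        X.append(numeric_part + one_hot_part)
    y = [float(r["actual_y1r"]) for r in rows]
    return names, X, y


rows = list(csv.DictReader(io.StringIO(TRAINING_CSV)))
names, X, y = encode(rows)
print(names)
print(X[0], y[0])
```

New data goes through the same encoding, minus the label column, and the trained model fills in the estimate.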

Data

But… the data is not in the warehouse

What is Stitch?

A SaaS platform for consolidating data from a wide array of data sources to data warehouses for analysis.

Getting going with Stitch

• Self-serve setup is easy

• Free plans for smaller data volumes

• Standard paid plans start at $100

• No commitment required

• No incremental cost for additional data sources

A platform with extensible data sources

• Any developer can build an integration

• Send data to Stitch, or another destination

• Existing Stitch integrations run on Singer

• Enables developers to support any use case

Singer is an open-source standard for writing scripts that move data
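In Singer's terms, a script that extracts data (a "tap") writes one JSON message per line to standard output: a SCHEMA message describing the stream, RECORD messages carrying rows, and STATE messages for bookmarking. A minimal sketch of that message flow; the `orders` stream, its fields, and the `last_id` state layout are hypothetical, and a real tap would pull from an API and would usually lean on the singer-python helper library instead of raw `json`.

```python
import json
import sys


def emit(message, out=sys.stdout):
    """Write one Singer message as a single line of JSON."""
    out.write(json.dumps(message) + "\n")


def run_tap(orders, out=sys.stdout):
    # SCHEMA describes the stream before any records are sent.
    emit({
        "type": "SCHEMA",
        "stream": "orders",
        "key_properties": ["id"],
        "schema": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "order_size": {"type": "number"},
            },
        },
    }, out)
    for order in orders:
        emit({"type": "RECORD", "stream": "orders", "record": order}, out)
    # STATE lets a later run resume from where this one stopped.
    last_id = max((o["id"] for o in orders), default=None)
    emit({"type": "STATE", "value": {"orders": {"last_id": last_id}}}, out)


if __name__ == "__main__":
    run_tap([
        {"id": 1001, "order_size": 103.45},
        {"id": 1002, "order_size": 140.12},
    ])
```

Because the output is plain JSON lines, the same tap can be piped to Stitch or to any other Singer target.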

Stitch is an Enterprise Solution

A simple solution to a complex problem

Data Sources

• Flexible UI

• Instant Connections

• Configurable Frequencies

• Historical Backload

• 70+ Integrations Today

Destinations

• Amazon Redshift

• Google BigQuery

• Snowflake

• Panoply

• PostgreSQL

Open Source

• Powered by singer.io

• Simple & Composable

• JSON Based Standard

• Extensible by Anyone

• Embed in your Application

Platform

• Integration

• Scalability

• Reliability

• Security

• Compliance

• Extensibility

Better Together :)

Read the Stitch - ezCater case study

https://www.stitchdata.com/customers/ezcater-enterprise-etl/


Data: Stitch

Data: Stitch Singer

• No waiting for AppsFlyer integration

• No ongoing costs to support AppsFlyer

• Contributing to existing HubSpot integration code

• Less lock-in

• Contractors can contribute

• Singer Slack Channel

Time to let the machines learn

Learning

“When starting, always do a simple regression first”

- Everybody

Training: Start Linear


Expectations

This… isn’t going to be easy

Training: Vowpal Wabbit

Out-of-core learning algorithm

• Popular on Kaggle

• Free

• Gradient descent

• Linear

• Logistic

• Neural

Input format:

40 |b event_day_of_week=Mon |c event_local_time:1200

Pros:

• Readable input format

• Super fast: <1 min for 500k

• Built-in protection against overfitting

Cons:

• Analysis totally DIY

• Viewing feature weightings wonky

• Totally DIY pipeline

• Regression not a fabulous fit
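Part of the "totally DIY pipeline" is turning each row into that input format. A minimal sketch of a formatter that reproduces the slide's example line; the helper name `to_vw_line` is hypothetical, and the namespace letters `b` and `c` simply follow the slide. Feature names and values must not contain spaces, colons, or pipes.

```python
def to_vw_line(label, categorical, numeric):
    """Format one example in Vowpal Wabbit's text input format.

    Categorical features become `name=value` tokens in namespace `b`;
    numeric features become `name:value` tokens in namespace `c`.
    """
    cat = " ".join(f"{name}={value}" for name, value in categorical.items())
    num = " ".join(f"{name}:{value}" for name, value in numeric.items())
    return f"{label} |b {cat} |c {num}"


line = to_vw_line(
    40,
    {"event_day_of_week": "Mon"},
    {"event_local_time": 1200},
)
print(line)
```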


Limits of Regression

Regression + co-dependence

• People that order on the weekend are MUCH worse

• People that make fewer orders are worse

BUT

If people make many orders, then it doesn’t matter if they order on the weekend!
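A small worked example makes the problem concrete. The four Y1R values below are made up to match the story: the weekend only hurts customers who make few orders. A plain linear model with no interaction term has to spread one weekend penalty across everyone; on a balanced 2x2 table like this, its least-squares fit is the row mean plus the column mean minus the grand mean.

```python
# Made-up average Y1R for each combination of two binary traits.
actual = {
    ("many_orders", "weekday"): 1000,
    ("many_orders", "weekend"): 1000,
    ("few_orders", "weekday"): 200,
    ("few_orders", "weekend"): 20,
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(actual.values())
freq_mean = {
    f: mean(v for (ff, _), v in actual.items() if ff == f)
    for f in ("many_orders", "few_orders")
}
day_mean = {
    d: mean(v for (_, dd), v in actual.items() if dd == d)
    for d in ("weekday", "weekend")
}

# Least-squares additive fit (no interaction term) on a balanced 2x2 table.
additive = {(f, d): freq_mean[f] + day_mean[d] - grand for (f, d) in actual}

for cell, y in actual.items():
    print(cell, "actual:", y, "additive fit:", additive[cell])
```

With these numbers the additive fit charges every weekend customer the same penalty of 90, although the true penalty is 0 for frequent orderers and 180 for infrequent ones. A tree can split on order count first and look at the weekend only inside the few-orders branch.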

Alternatives to Regression

• Gradient boosted trees

• Random forests

• Clustering

• Neural nets

• SO MANY OTHER CHOICES
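For readers who have not met the first item on that list, here is a toy sketch of gradient boosting for squared error: start from the mean, then repeatedly fit a tiny tree to the remaining errors and add a fraction of it to the prediction. It uses one feature and one-split trees ("stumps"), and the data is made up. This is an illustration of the idea only, not what RapidMiner or xgboost run internally; real libraries grow deeper trees over many features, which is what lets them capture interactions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # needs at least two distinct x values
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def fit_gbt(xs, ys, rounds=50, learning_rate=0.3):
    """Boost stumps on the residuals of the running prediction."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(ys)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Made-up data: x could be head-count, y the customer's Y1R.
xs = [1, 2, 3, 4]
ys = [10, 10, 50, 50]
model = fit_gbt(xs, ys)
print([round(model(x), 2) for x in xs])
```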

Training Choices

1. Install xgboost, learn Python & scikit, or maybe R

2. See whether there’s something to one of these machine learning companies

RapidMiner

• Test 4 totally different techniques on the exact same data!

• Compare ROC curves trivially!

• Free trial

• ~1 hour to get my CSV -> Regression outputs

• Super easy to explain pipeline to new colleagues

• Query Redshift right from RapidMiner

• Generate new attributes

• Clear visual pipeline

• Simple analysis

• No lack of detail

• Nice things “just work”

(It works!)

Results

R^2    Spearman’s Rho  Accuracy of prediction
.23    .648            .82
.504   .834            .89
.73    .920            .94
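For reference, the first two metrics can be computed from actual and predicted values as sketched below. R^2 measures how much of the variance the predictions explain; Spearman's rho is the Pearson correlation of the ranks, so it rewards getting the ordering of customers right even when the dollar amounts are off. The predictions here are hypothetical, and "accuracy of prediction" is left out because the slides do not define it.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(actual, predicted):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(actual), average_ranks(predicted))


actual_y1r = [500, 200, 10, 30, 14]     # Actual Y1R column from the training table
predicted_y1r = [420, 260, 40, 25, 60]  # hypothetical model output
print("R^2:", round(r_squared(actual_y1r, predicted_y1r), 3))
print("Spearman's rho:", round(spearman_rho(actual_y1r, predicted_y1r), 3))
```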


EY1R shows major differences in cohort quality

Blue and Red converted the best… but lost on EY1R

Top Line Company Metrics

[Chart: # New Customers / week and New EY1R / week, Before vs. After]

Challenges

1. Ingo’s 2nd mistake

2. Life changes underneath you: utm_values

3. But what does it mean?

4. KISS

5. Mixing Time Horizons

6. Productionization

Takeaways

1. Estimating LTV is doable

2. Never say “LTV” by itself: prefer EY1R etc.

3. RapidMiner allows mere mortals to use data science

4. Gradient Boosted Trees are great. (aka “Listen to YY”)

Let’s Connect

Jeff Dwyer

@jdwyah

engineering.ezcater.com

Software Engineering Manager

Works on growth at ezCater

Not a machine learning expert
