Predictive Modeling for E-Mail Marketing
Arthur Middleton Hughes – Senior Strategist
Anna Lu - Director of Research and Analytics
Predictive Analytics World Feb 18, 2009
What Does E-mail Marketing Do?
Produces online sales – in many cases
Produces retail sales – in many more cases
Produces customer retention and loyalty
Helps to acquire new customers
Announces new products
Creates cross-sales and upgrades
Can be the most powerful and cost effective marketing method that marketers have available today -- particularly in an economic downturn
2
E-mail’s Role Not Understood
In many companies, e-mail is not recognized as the marketing powerhouse that it is
It is somewhere off on the side, producing Web sales which are about 3% or less of total sales
That may be the perception, but companies that think that way are missing the boat
Here is the reality…
3
E-mail Produces Four Times as Much Offline as Online
4
The value of multi-channel customers
E-mail marketing budgets are often based only on online sales
This is a mistake, because e-mail produces four times as many sales offline as it does online
Calculate the true effect of e-mail so that the marketing budget can reflect the true worth of e-mail marketing
5
E-mail Influences all channels
6
Predictive models seldom used
Most e-mail marketers today do not use predictive modeling. Why not?
• Predictive modeling is used in Direct Mail where the CPM is $600 or more. In e-mail marketing the CPM is $8 or less. Many marketers feel that the savings from a model would not pay for the model.
• Many e-mail marketers are young people who have never heard of predictive modeling
• The philosophy is: “Mail ‘em all. Someone is going to buy…”
• This attitude is beginning to change. Here’s why….
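The CPM argument in the first bullet is simple arithmetic. A minimal sketch, assuming a hypothetical list of one million names and a hypothetical model that suppresses half of them; only the $600 and $8 CPM figures come from the slide:

```python
def mailing_cost(cpm_dollars, pieces):
    """Cost of a mailing, where CPM is the cost per thousand pieces."""
    return cpm_dollars * pieces / 1000

LIST_SIZE = 1_000_000    # hypothetical list size
SUPPRESSED_SHARE = 0.5   # hypothetical share of names a model would drop

for channel, cpm in (("direct mail", 600), ("e-mail", 8)):
    full = mailing_cost(cpm, LIST_SIZE)
    saved = full * SUPPRESSED_SHARE
    print(f"{channel}: full mailing ${full:,.0f}, saved by suppression ${saved:,.0f}")
```

On delivery cost alone the model saves little in e-mail, which is why the case for modeling has to rest on subscriber value rather than on postage.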
7
Email open rates are falling
8
People are unsubscribing
It costs between $10 and $40 to acquire a permission based subscriber e-mail address.
Inboxes today are so crowded with e-mails that millions unsubscribe or delete e-mails en masse without reading them.
A relevant email to a good customer gets lost in the spam.
Many marketers are mailing too often
The annual loss from unsubscribers from large mailers comes to millions of dollars
9
Predicting the unsubscribers
Unsubscribe rates are often 3% or more per month.
If a mailer has 4 million subscribers, and the value of each subscriber is $15, he could be losing $21 million per year.
If the unsubscribe rate could be reduced by 10% he would save $2.1 million per year.
You could pay for several predictive models with that kind of saving.
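The loss estimate can be checked with a few lines of arithmetic. This sketch assumes the subscriber base is replenished to a constant 4 million every month, an assumption the slide does not state. Under it the straightforward product is $21.6 million per year, which the slide rounds to $21 million.

```python
SUBSCRIBERS = 4_000_000
MONTHLY_UNSUB_RATE = 0.03
VALUE_PER_SUBSCRIBER = 15  # dollars

def annual_unsub_loss(subscribers, monthly_rate, value):
    """Yearly loss in dollars, assuming the base stays at a constant size."""
    lost_per_year = subscribers * monthly_rate * 12
    return lost_per_year * value

loss = annual_unsub_loss(SUBSCRIBERS, MONTHLY_UNSUB_RATE, VALUE_PER_SUBSCRIBER)
# A 10% relative reduction in the unsubscribe rate (3.0% -> 2.7% per month)
reduced = annual_unsub_loss(SUBSCRIBERS, MONTHLY_UNSUB_RATE * 0.9, VALUE_PER_SUBSCRIBER)
saving = loss - reduced

print(f"annual loss:   ${loss:,.0f}")
print(f"annual saving: ${saving:,.0f}")
```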
10
Finding Likely Unsubs with CHAID
Case Study: Loyalty program for a major US low cost airline
11
Program Background
Frequent flyer program for a major low cost airline in US
Semi-weekly e-mail program offered to members who wish to accumulate "points" they can put towards flights, SkyMall products and more
E-mail drives a significant percentage of the total revenue
12
Business Problem
Status     % of Program Base   % of Revenue Generated (Lifetime)   % of E-mail Revenue (2 yrs.)   % of E-mail Revenue (12 months)
Mailable   81.5%               70.6%                               78.7%                          82.4%
Opt-out    18.5%               29.4%                               21.3%                          17.6%
13
Objective
Understand key characteristics of previous opt-outs
Identify likely unsubs
Initiate save programs to prevent unsubs from happening
14
Analysis Background
Random sample of 5% of member base
Approx 50 predictor variables
• Program attributes such as enrollment date, mile accumulation, usage, recency of mile redemption, total reward points, Lifetime revenue, etc.
• E-mail behaviors such as opens, clicks and purchases (from e-mails sent)
Response variable – Unsubscribed versus still mailable (binary level variable)
CHAID (Chi-square Automatic Interaction Detector) algorithm
Cross validation method
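The sampling and validation steps can be sketched as follows. The slide does not say which cross-validation scheme was used; k-fold with five folds is an assumption here, and the member ids are hypothetical.

```python
import random

def sample_ids(member_ids, fraction, seed=0):
    """Draw a simple random sample of the given fraction of ids."""
    rng = random.Random(seed)
    k = round(len(member_ids) * fraction)
    return rng.sample(member_ids, k)

def k_fold_splits(ids, k):
    """Yield (train, test) lists; each id lands in exactly one test fold."""
    folds = [ids[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

members = list(range(100_000))            # hypothetical member ids
sample = sample_ids(members, 0.05)        # 5% random sample, as on the slide
splits = list(k_fold_splits(sample, 5))   # 5 folds is an assumption
```

Each model is fit on the train part of a split and scored on the held-out test part, so the unsubscribe rates reported per node are not judged only on the data that produced the tree.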
15
About CHAID
A type of decision tree technique
Use of the chi-square test for contingency tables to decide which variables are of maximal importance for classification
Advantages are that its output is highly visual and easy to interpret
Often used as an exploratory technique and is an alternative to multiple regression
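A minimal sketch of the chi-square step that CHAID uses to rank candidate predictors. The counts are hypothetical, and this shows only the split-selection test; a full CHAID implementation also merges similar categories, adjusts p-values for multiple comparisons, and repeats the process within each branch.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_2x2(stat):
    """Upper-tail p-value for 1 degree of freedom (valid for 2x2 tables only)."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical counts: rows = predictor level, columns = (unsubscribed, mailable)
candidates = {
    "opened_last_60d": [[30, 970], [90, 910]],
    "redeemed_voucher": [[55, 945], [65, 935]],
}

# The predictor with the smallest p-value is the strongest candidate for a split
best = min(candidates, key=lambda name: p_value_2x2(chi_square(candidates[name])))
print(best)
```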
16
Output (Partial)
17
Figure callouts: % Unsub overall; % Unsub among people with # of opens in last 60 days = 1
Predictors Selected
# of e-mails opened last 60 days
Days since loyalty club enrollment
# of e-mails opened last 30 days
# of Bonus (partner) credits earned YTD
Days since last travel
Days since most recent e-mail opened or clicked
Date of Last earn/ or redemption of flight/ or Bonus (partner) credit
# of e-mails opened last 365 days
# of vouchers redeemed in lifetime
18
Node Gain
Gain Chart on model development sample
19
        Node by Node                              Cumulative
Node    Node (%)  Gain (%)  Unsub (%)  Index      Node (%)  Gain (%)  Unsub (%)  Index
6       2.07      5.56      2.14       269        2.1       5.6       2.1        269
13      2.66      6.43      1.93       242        4.7       12.0      2.0        253
64      4.46      8.48      1.52       190        9.2       20.5      1.8        223
14      3.46      4.39      1.01       127        12.7      24.9      1.6        196
57      28.11     33.04     0.94       118        40.8      57.9      1.1        142
65      6.51      6.73      0.82       103        47.3      64.6      1.1        137
5       8.49      8.19      0.77       96         55.8      72.8      1.0        131
58      12.89     10.82     0.67       84         68.6      83.6      1.0        122
60      9.5       7.6       0.64       80         78.1      91.2      0.9        117
12      4.23      2.92      0.55       69         82.4      94.2      0.9        114
62      4.06      2.05      0.4        50         86.4      96.2      0.9        111
59      7.77      2.92      0.3        38         94.2      99.1      0.8        105
61      5.8       0.88      0.12       15         100       100       0.8        100
Total   100       100       0.8        100
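The Index and cumulative columns of the gain chart above can be derived from the Node (%) and Gain (%) columns alone. This sketch assumes Index = 100 × Gain (%) / Node (%), a reading that matches the slide's figures to within rounding but is not stated on the slide.

```python
# (node id, % of sample in node, % of all unsubscribers in node), from the slide
nodes = [
    (6, 2.07, 5.56), (13, 2.66, 6.43), (64, 4.46, 8.48), (14, 3.46, 4.39),
    (57, 28.11, 33.04), (65, 6.51, 6.73), (5, 8.49, 8.19), (58, 12.89, 10.82),
    (60, 9.5, 7.6), (12, 4.23, 2.92), (62, 4.06, 2.05), (59, 7.77, 2.92),
    (61, 5.8, 0.88),
]

def gain_rows(nodes):
    """Return per-node and cumulative gain figures, in the order given."""
    cum_node = cum_gain = 0.0
    rows = []
    for node_id, node_pct, gain_pct in nodes:
        cum_node += node_pct
        cum_gain += gain_pct
        rows.append({
            "node": node_id,
            "index": round(100 * gain_pct / node_pct),
            "cum_node": round(cum_node, 1),
            "cum_gain": round(cum_gain, 1),
            "cum_index": round(100 * cum_gain / cum_node),
        })
    return rows

rows = gain_rows(nodes)
for row in rows:
    print(row)
```

Read this way, mailing a save offer to the first three nodes reaches about 9% of members but about 20% of likely unsubscribers.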
Revenue
Top 10% of the members contributed to 67% of total revenue
20
Revenue Rank % of Base % of Revenue
Decile 1 10.8% 67.2%
Decile 2 10.4% 17.0%
Decile 3 10.4% 9.6%
Decile 4 10.6% 5.9%
Decile 5 - 10 57.9% 0.3%
Total 100.0% 100.00%
X-Tab: Node vs. Revenue
21
Each of the top nodes has high-revenue-producing members
                Revenue
Node            Decile 1  Decile 2  Decile 3  Decile 4  Decile 5 - 10  Total
6               8.8%      19.2%     21.1%     23.5%     27.4%          100.0%
13              11.8%     12.4%     12.3%     13.4%     50.1%          100.0%
64              23.4%     15.0%     6.4%      4.6%      50.6%          100.0%
14              15.1%     13.9%     12.2%     13.2%     45.5%          100.0%
57              3.8%      2.4%      2.6%      2.8%      88.5%          100.0%
65              16.6%     12.0%     5.9%      4.2%      61.3%          100.0%
5               24.2%     14.6%     9.8%      10.0%     41.4%          100.0%
58              27.5%     18.9%     16.0%     16.8%     20.8%          100.0%
60              7.6%      17.2%     23.7%     25.1%     26.5%          100.0%
12              9.3%      10.7%     10.5%     11.7%     57.9%          100.0%
62              1.6%      13.7%     32.0%     35.0%     17.7%          100.0%
59              11.9%     14.2%     13.4%     12.9%     47.6%          100.0%
61              0.0%      0.2%      0.2%      0.4%      99.2%          100.0%
Overall/Total   10.8%     10.4%     10.4%     10.6%     57.9%          100.0%
Identifying most profitable flyers
4% (or 120K) of frequent flyers contributed 15% (~$3.1 million) of program revenue
22
Percent of member base
        Revenue
Node    Decile 1  Decile 2  Decile 3  Decile 4  Decile 5 - 10  Total
6       0.2%      0.4%      0.4%      0.5%      0.5%           1.9%
13      0.3%      0.3%      0.3%      0.3%      1.3%           2.6%
64      1.0%      0.7%      0.3%      0.2%      2.3%           4.5%
14      0.5%      0.5%      0.4%      0.5%      1.6%           3.5%
57      1.1%      0.7%      0.7%      0.8%      25.0%          28.2%
65      1.1%      0.8%      0.4%      0.3%      3.9%           6.4%
5       2.1%      1.2%      0.8%      0.9%      3.5%           8.6%
58      1.6%      1.1%      0.9%      1.0%      1.2%           5.8%
60      0.7%      1.7%      2.3%      2.4%      2.6%           9.7%
12      0.4%      0.5%      0.4%      0.5%      2.5%           4.3%
62      0.1%      0.6%      1.3%      1.4%      0.7%           4.1%
59      1.7%      2.1%      2.0%      1.9%      6.9%           14.6%
61      0.0%      0.0%      0.0%      0.0%      5.8%           5.9%
Total   10.8%     10.4%     10.4%     10.6%     57.9%          100.0%

Percent of revenue
        Revenue
Node    Decile 1  Decile 2  Decile 3  Decile 4  Decile 5 - 10  Total
6       0.7%      0.6%      0.4%      0.3%      0.0%           2.0%
13      2.0%      0.5%      0.3%      0.2%      0.0%           3.0%
64      6.3%      1.1%      0.3%      0.1%      0.0%           7.8%
14      3.0%      0.8%      0.4%      0.3%      0.0%           4.5%
57      9.4%      1.1%      0.7%      0.4%      0.0%           11.6%
65      6.9%      1.3%      0.4%      0.1%      0.0%           8.6%
5       13.9%     2.1%      0.8%      0.5%      0.0%           17.3%
58      10.3%     1.8%      0.8%      0.5%      0.0%           13.5%
60      2.8%      2.7%      2.1%      1.4%      0.1%           9.0%
12      2.6%      0.7%      0.4%      0.3%      0.0%           4.1%
62      0.2%      0.9%      1.2%      0.8%      0.0%           3.1%
59      9.1%      3.4%      1.8%      1.0%      0.0%           15.4%
61      0.0%      0.0%      0.0%      0.0%      0.0%           0.0%
Total   67.2%     17.0%     9.6%      5.9%      0.3%           100.0%
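The "4% of flyers, 15% of revenue" figure can be recovered from the two node-by-decile tables above. The selection below (the four nodes with the highest unsubscribe index, crossed with revenue deciles 1 and 2) is an assumption; the slide does not say which cells were combined, but this choice reproduces its numbers.

```python
# Cell values copied from the two tables above, in percent.
# Keys are node ids; inner keys are revenue deciles.
base_share = {
    6:  {"Decile 1": 0.2, "Decile 2": 0.4},
    13: {"Decile 1": 0.3, "Decile 2": 0.3},
    64: {"Decile 1": 1.0, "Decile 2": 0.7},
    14: {"Decile 1": 0.5, "Decile 2": 0.5},
}
revenue_share = {
    6:  {"Decile 1": 0.7, "Decile 2": 0.6},
    13: {"Decile 1": 2.0, "Decile 2": 0.5},
    64: {"Decile 1": 6.3, "Decile 2": 1.1},
    14: {"Decile 1": 3.0, "Decile 2": 0.8},
}

def total_share(table):
    """Sum every cell of a node-by-decile table, rounded to one decimal."""
    return round(sum(v for row in table.values() for v in row.values()), 1)

at_risk_base = total_share(base_share)
at_risk_revenue = total_share(revenue_share)
print(f"{at_risk_base}% of members hold {at_risk_revenue}% of revenue")
```

These high-risk, high-revenue cells are the natural first audience for a save program.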
A risk-revenue matrix
23
Using the output of the model
Now that you know those most likely to unsubscribe
And know who are the most valuable
You can single out these folks and make them an offer that they cannot refuse.
Analytics helps the airline target the right people.
24
How modeling reduced churn
In one year, analytics helped a wireless phone company, Cingular, reduce monthly churn by 26%, worth millions of dollars.
25
Identify Best-Customer Look-Alikes with Logistic Regression
26
Case Study: US off-price e-tailer
Background
Off-price e-tailer of name-brand apparel and other goods in US
E-mail is its single largest marketing channel and its most important retention tool
E-mail communication delivers 40% of the total revenue
27
What can be measured
Attrition and retention
Migration upward and downward
Incremental sales per program and per season
Frequency of seasonal purchases
Dollars spent per trip and per season
Number of departments shopped per trip and per season
Number of items shopped per trip and per season
Share of customers’ wallet
28
Business Problem
About 50% of revenue is driven by loyalty club members
• An annual membership fee is required
Size of loyalty club is small – just 1.8% of e-mail base
Client asked:
• Who should we focus on as the next tier of subscribers among the other ~98% of the e-mail list?
• Who looks like the best customers I have?
• How can we find people who might become best customers if nurtured?
29
Objective
Understand what variables describe best customers
Identify likely best customers
Initiate programs to nurture these subscribers, to keep them happy
30
Analysis Background
Random sample of 10% of e-mail subscriber base
Approx 10 predictor variables
• Attributes such as # of lifetime purchases, first/most recent order, e-mail address acquisition source, etc.
• E-mail behaviors such as e-mail tenure, opens, clicks and purchases (from e-mails sent)
Response variable – Loyalty program member vs. non-Loyalty program member (binary level variable)
Logistic Regression
Cross validation method
31
About Logistic Regression
Prediction of the probability of occurrence of an event by fitting data to a logistic curve
A very useful technique when one wants to understand or predict the effect of a series of variables on a binary response variable (a variable that can take only two values, such as 0/1 or yes/no)
For example, it helps anticipate the likelihood that a customer will respond to a direct mail piece, or that a person is about to churn from a subscription
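A minimal sketch of how a fitted logistic model turns predictor values into a probability. The intercept, coefficients, and customer values below are hypothetical and chosen only for illustration.

```python
import math

def logistic(z):
    """The logistic curve: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(intercept, coefficients, features):
    """Probability of the event for one record, given fitted coefficients."""
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return logistic(z)

# Hypothetical model and customer, for illustration only
coefficients = {"orders": 0.5, "months_since_last_order": -0.2}
customer = {"orders": 4, "months_since_last_order": 2}
p = predict_probability(-3.0, coefficients, customer)
print(f"predicted probability: {p:.3f}")
```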
32
Impact of Predictors
Some variables used included:
• Total # of purchases
The more the better
• Time on file
The younger the better
• Months since first purchase
The more the better
• Months since last purchase
The less (or more recent) the better
• Total e-mails clicked on over the past year
The more the better
• Total e-mails opened over the past year
The more the better… though not always predictive
33
Parameter             Beta
LIFETIME_ORDERS       0.105
TENURE_MON            -0.052
MON_SINCE_FIRSTPUR    0.047
MON_SINCE_LASTORDER   -0.047
CLKS                  0.01
OPNS                  0.004
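One way to read the betas is as odds ratios: exp(beta) is the factor by which the odds of being a best customer change for each one-unit increase in the predictor. This sketch assumes the betas are unstandardized log-odds coefficients. The slide gives no intercept, so probabilities cannot be computed from it.

```python
import math

# Coefficients as listed on the slide
betas = {
    "LIFETIME_ORDERS": 0.105,
    "TENURE_MON": -0.052,
    "MON_SINCE_FIRSTPUR": 0.047,
    "MON_SINCE_LASTORDER": -0.047,
    "CLKS": 0.01,
    "OPNS": 0.004,
}

# exp(beta) > 1 raises the odds per unit; exp(beta) < 1 lowers them
odds_ratios = {name: math.exp(beta) for name, beta in betas.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: each additional unit multiplies the odds by {ratio:.3f}")
```

The near-zero OPNS coefficient is consistent with the earlier note that opens are "not always predictive".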
Model Gain
Gain Chart on model development sample
34
         Decile by Decile                          Cumulative
Decile   Node (%)  Gain (%)  Best (%)  Index       Node (%)  Gain (%)  Best (%)  Index
1        10.1      92.5      14.1      783         10.1      92.5      14.1      783
2        10.3      6.4       1.1       61          20.4      98.9      8.7       484
3        9.6       0.5       0.1       6           30.0      99        6.0       331
4        11        0.4       0.1       6           41.0      100       4.4       243
5        10.2      0.2       0         0           51.2      100       3.5       195
6        8.8       0.2       0         0           60.0      100       3.0       167
7        14.3      0         0         0           74.3      100       2.4       135
8        5.9       0         0         0           80.2      100       2.2       125
9        5.4       0         0         0           85.6      100       2.1       117
10       14.4      0         0         0           100       100       1.8       100
Total    100       100       1.8       100
Now that we know who to target…
The model enables us to focus on those most likely to be interested in the loyalty club.
We can target only those folks with messages and rewards that will get them to join.
We make them offers that we could not afford to offer to everyone.
How the model boosts profits and reduces churn…
35
Model beats random select
A model predicts those subscribers who would be interested in a particular product.
Mailing these 273,334 subscribers produces 842 sales and only 273 unsubscribers.
If the model had not been used, there would have been only 41 sales and 3,553 unsubscribers.
Replacing each unsubscriber costs $14.
Without the model, the mailing would have been a disaster.
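The comparison can be put in rates and dollars. This sketch assumes the no-model scenario is a random selection of the same size (273,334 names), which the slide title suggests but does not state. It counts only the cost of replacing lost subscribers, since the slide gives no revenue per sale.

```python
MAILED = 273_334
COST_PER_LOST_SUBSCRIBER = 14  # dollars, from the slide

scenarios = {
    "with model": {"sales": 842, "unsubs": 273},
    "without model": {"sales": 41, "unsubs": 3_553},
}

def summarize(sales, unsubs, mailed=MAILED):
    """Sales rate, unsubscribe rate, and cost of replacing lost subscribers."""
    return {
        "sales_rate_pct": 100 * sales / mailed,
        "unsub_rate_pct": 100 * unsubs / mailed,
        "replacement_cost": unsubs * COST_PER_LOST_SUBSCRIBER,
    }

summary = {name: summarize(**counts) for name, counts in scenarios.items()}
extra_cost = (summary["without model"]["replacement_cost"]
              - summary["with model"]["replacement_cost"])

for name, s in summary.items():
    print(f"{name}: {s['sales_rate_pct']:.2f}% sales, "
          f"{s['unsub_rate_pct']:.2f}% unsubs, ${s['replacement_cost']:,} to replace")
print(f"extra replacement cost without the model: ${extra_cost:,}")
```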
36
Conclusions
Predictive modeling is just getting started in e-mail marketing.
Reason: e-mails are so inexpensive that the attitude was: “Blast ‘em all!”
We now realize that subscribers are very valuable. We can lose them by random blasting.
Models help us by reducing unsubscribes and also by identifying those subscribers who are most interested in what we have to say.
Predictive modeling works with e-mail marketing.
37
To learn more….
Available from Amazon.com or BarnesandNoble.com
38
Thank you for viewing.
39
For more information, please contact:
Arthur Middleton Hughes, Senior Strategist | 954-767-4558
Anna Lu, Director of Research and Analytics | 781-372-1961