Churn Prediction in Mobile Social Games: Towards a Complete Assessment Using Survival Ensembles
África Periáñez, Alain Saas, Anna Guitart and Colin Magne
IEEE/ACM DSAA 2016, Montreal, October 19th, 2016



Page 2: DSAA 2016 Churn Prediction in Mobile Social Games

About us

Who are we?

● Game and technology company based in Tokyo (spin-off of Silicon Graphics)

● Research project to provide Game Data Science as a Service

● Goals: predict player behavior, scale to big data, and visualize results intuitively

Page 3: DSAA 2016 Churn Prediction in Mobile Social Games

Our data

● Free-to-play mobile social games
● In-app purchases and activity behavioral data

Page 4: DSAA 2016 Churn Prediction in Mobile Social Games

Churn prediction in Free-To-Play games

We focus on the top spenders: the whales
➔ 0.2% of the players, 50% of the revenue
➔ Their high engagement makes them more likely to respond positively to actions taken to retain them
➔ For this group, we can define churn as 10 days of inactivity

◆ The definition of churn in F2P games is not straightforward
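The 10-day inactivity rule can be written as a small check. This is a minimal sketch, not the authors' code: the function name `is_churned`, the `as_of` reference date, and the choice of "at least 10 days" (rather than "more than 10") are assumptions made for illustration.

```python
from datetime import date

CHURN_THRESHOLD_DAYS = 10  # inactivity window stated on the slide


def is_churned(last_active: date, as_of: date,
               threshold_days: int = CHURN_THRESHOLD_DAYS) -> bool:
    """Return True if the player has been inactive for at least `threshold_days`.

    Assumption: "10 days of inactivity" is read as >= 10 full days
    between the last active day and the reference date.
    """
    if as_of < last_active:
        raise ValueError("as_of must not precede last_active")
    return (as_of - last_active).days >= threshold_days


# Hypothetical players, evaluated on the day of the talk.
print(is_churned(date(2016, 10, 1), date(2016, 10, 19)))   # 18 days inactive
print(is_churned(date(2016, 10, 12), date(2016, 10, 19)))  # 7 days inactive
```

A player who is inactive for fewer than 10 days at the reference date is not yet known to have churned, which is where censoring comes in.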

Page 5: DSAA 2016 Churn Prediction in Mobile Social Games

Feature selection

◎ Game independent features:

○ player attention: time spent per day, lifetime

○ player loyalty: number of days connected, loyalty index (number of days played over lifetime), days from registration to first purchase, days since last purchase

○ player intensity: number of actions, sessions, amount of in-app purchases, action activity distance (total average actions compared to last days' behaviour)

○ player level (concept common to most games)

◎ Game dependent features researched but ultimately not part of our model:

○ participation in a guild (social feature)

○ actions measured by categories
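Two of the game-independent features above can be sketched as follows. The loyalty index follows the definition on the slide (days played over lifetime). The slide does not give a formula for the action activity distance, so the version here (lifetime average of daily actions minus the average over the last few days) and the `recent_days` parameter are assumptions for illustration only.

```python
from typing import Sequence


def loyalty_index(days_played: int, lifetime_days: int) -> float:
    """Number of days played divided by the player's lifetime in days."""
    if lifetime_days <= 0:
        raise ValueError("lifetime_days must be positive")
    return days_played / lifetime_days


def action_activity_distance(daily_actions: Sequence[float],
                             recent_days: int = 7) -> float:
    """Assumed form: lifetime mean of daily actions minus the recent mean.

    A positive value means the player is currently less active than usual.
    """
    if not daily_actions or recent_days <= 0:
        raise ValueError("need at least one day of data and recent_days > 0")
    overall_mean = sum(daily_actions) / len(daily_actions)
    recent = daily_actions[-recent_days:]
    return overall_mean - sum(recent) / len(recent)


print(loyalty_index(days_played=15, lifetime_days=60))
print(action_activity_distance([10, 10, 10, 10, 2, 2], recent_days=2))
```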


Page 6: DSAA 2016 Churn Prediction in Mobile Social Games

The model
Survival Ensembles

Page 7: DSAA 2016 Churn Prediction in Mobile Social Games

Challenge: modeling churn

◎ Survival analysis focuses on predicting the time to an event, e.g. churn
○ When will a player stop playing?

◎ Classical methods, such as regression, are appropriate only when all players have already left the game

◎ Censoring problem: the dataset contains incomplete churn information

◎ Censoring is inherent to churn data

➔ Survival analysis is used in biology and medicine to deal with this problem

➔ Ensemble learning techniques provide high-quality predictions
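To make the censoring problem concrete, each player can be turned into a (duration, event) pair. In this sketch a churned player contributes an observed event, while a player still active at the data cutoff contributes a right-censored duration. The function name `survival_record`, the `cutoff` date, and the 10-day threshold default are illustrative assumptions, not taken from the talk's implementation.

```python
from datetime import date
from typing import Tuple


def survival_record(first_seen: date, last_active: date, cutoff: date,
                    threshold_days: int = 10) -> Tuple[int, bool]:
    """Return (duration_days, event_observed) for one player.

    event_observed is True when churn was seen before the cutoff;
    False means the player is right-censored (still playing at cutoff).
    """
    inactive_days = (cutoff - last_active).days
    if inactive_days >= threshold_days:
        # Churn observed: lifetime ends at the last active day.
        return (last_active - first_seen).days, True
    # Censored: we only know the player survived at least until the cutoff.
    return (cutoff - first_seen).days, False


cutoff = date(2016, 10, 19)
print(survival_record(date(2016, 1, 1), date(2016, 3, 1), cutoff))
print(survival_record(date(2016, 9, 1), date(2016, 10, 15), cutoff))
```

A plain regression on lifetime would have to drop or mislabel the second player; survival methods keep the record and use the information that the true lifetime is at least the censored duration.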


Page 8: DSAA 2016 Churn Prediction in Mobile Social Games

Output of the model

◎ We focus on whales
◎ Cumulative survival probability (Kaplan-Meier estimates)
◎ Step function that changes every time a player churns
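The Kaplan-Meier step function can be computed directly from (duration, event) pairs using the product-limit formula S(t) = ∏ (1 − d_i / n_i), where d_i is the number of churn events at time t_i and n_i the number of players still at risk. The sketch below is a dependency-free illustration with made-up data; the function name `kaplan_meier` is hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct churn time, in increasing t."""
    event_times = sorted({t for t, e in zip(durations, events) if e})
    curve = []
    survival = 1.0
    for t in event_times:
        # Players whose observed duration reaches t are still at risk.
        at_risk = sum(1 for d in durations if d >= t)
        churned = sum(1 for d, e in zip(durations, events) if e and d == t)
        survival *= 1.0 - churned / at_risk
        curve.append((t, survival))
    return curve


# Made-up lifetimes in days; False marks a censored (still active) player.
durations = [5, 8, 8, 12, 15]
events = [True, True, False, True, False]
for t, s in kaplan_meier(durations, events):
    print(t, round(s, 3))
```

The curve only steps down at times where a churn is observed; censored players leave the risk set without producing a step.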

Page 9: DSAA 2016 Churn Prediction in Mobile Social Games

Challenge: modeling churn

◎ Two approaches:
○ Churn as a binary classification
○ Churn as a censored data problem

◎ One model: Conditional Inference Survival Ensembles 1)
○ deals with censoring
○ high accuracy due to ensemble learning

Survival Analysis

➔ Survival analysis methods (e.g. Cox regression) do not assume any particular statistical distribution: they are fitted from data

➔ Fixed link between output and features: effort is needed for model selection and evaluation

1) Hothorn et al., 2006. Unbiased recursive partitioning: A conditional inference framework

Page 10: DSAA 2016 Churn Prediction in Mobile Social Games

Conditional inference survival ensembles

Survival Tree
➔ Splits the feature space recursively
➔ Based on a survival statistical criterion, the root node is divided into two daughter nodes
➔ Maximizes the survival difference between nodes
➔ A single tree produces unstable predictions

Conditional Survival Ensembles
➔ Outstanding predictions
➔ Make use of hundreds of trees
➔ Conditional inference survival ensembles use a Kaplan-Meier function as splitting criterion
➔ Overfitting is not present
➔ Robust information about variable importance
➔ Unbiased approach
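The stabilizing effect of combining many resampled estimates can be illustrated without any trees. The sketch below is not the conditional inference ensemble itself: it simply averages Kaplan-Meier estimates over bootstrap resamples of the players, which is the generic bagging idea that tree ensembles build on. All names (`km_survival_at`, `bagged_survival`, `n_resamples`, `seed`) and the data are hypothetical.

```python
import random
from typing import List, Sequence


def km_survival_at(durations: Sequence[float], events: Sequence[bool],
                   t: float) -> float:
    """Kaplan-Meier estimate of S(t) for right-censored data."""
    survival = 1.0
    for u in sorted({d for d, e in zip(durations, events) if e and d <= t}):
        at_risk = sum(1 for d in durations if d >= u)
        churned = sum(1 for d, e in zip(durations, events) if e and d == u)
        survival *= 1.0 - churned / at_risk
    return survival


def bagged_survival(durations: Sequence[float], events: Sequence[bool],
                    times: Sequence[float], n_resamples: int = 200,
                    seed: int = 0) -> List[float]:
    """Average the Kaplan-Meier estimate over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(durations)
    totals = [0.0] * len(times)
    for _ in range(n_resamples):
        # Draw n players with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        d = [durations[i] for i in idx]
        e = [events[i] for i in idx]
        for k, t in enumerate(times):
            totals[k] += km_survival_at(d, e, t)
    return [total / n_resamples for total in totals]


durations = [5, 8, 8, 12, 15, 20, 22, 30]
events = [True, True, False, True, False, True, True, False]
print(bagged_survival(durations, events, times=[5, 10, 20, 30]))
```

In a real survival ensemble, each resample would also grow a tree on the features, and predictions would be aggregated per player rather than for the whole population.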

Page 11: DSAA 2016 Churn Prediction in Mobile Social Games

Survival tree

Linear rank statistics as splitting criterion

Figure: conditional inference survival tree partition, with Kaplan-Meier estimates of the survival time characterizing the players placed in each terminal node

Page 12: DSAA 2016 Churn Prediction in Mobile Social Games

Conditional inference survival ensembles

◎ Two-step algorithm:

○ 1) the optimal split variable is selected, based on the association between covariates and response

○ 2) the optimal split point is determined by comparing two-sample linear statistics for all possible partitions of the split variable

Random Survival Forest

➔ RSF is based on the original random forest algorithm 1)

➔ RSF favors variables with many possible split points over variables with fewer

1) Breiman L. 2001. Random Forests.
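The second step (choosing a split point by comparing a two-sample statistic over all candidate partitions) can be sketched for a single variable. As a stand-in for the standardized linear rank statistics of the conditional inference framework, this sketch uses the classical two-sample log-rank chi-square statistic; it does not perform the permutation-test-based variable selection of step 1. Function names and data are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def logrank_chi2(durations: Sequence[float], events: Sequence[bool],
                 in_left: Sequence[bool]) -> float:
    """Two-sample log-rank chi-square statistic for a left/right partition."""
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({d for d, e in zip(durations, events) if e}):
        n = sum(1 for d in durations if d >= t)
        n_left = sum(1 for d, g in zip(durations, in_left) if d >= t and g)
        deaths = sum(1 for d, e in zip(durations, events) if e and d == t)
        deaths_left = sum(1 for d, e, g in zip(durations, events, in_left)
                          if e and d == t and g)
        # Observed minus expected events in the left group at time t.
        observed_minus_expected += deaths_left - deaths * n_left / n
        if n > 1:
            p = n_left / n
            variance += deaths * p * (1 - p) * (n - deaths) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_split(feature: Sequence[float], durations: Sequence[float],
               events: Sequence[bool]) -> Optional[Tuple[float, float]]:
    """Return (threshold, statistic) maximizing the survival difference."""
    best = None
    for threshold in sorted(set(feature))[:-1]:  # keep both sides non-empty
        in_left = [x <= threshold for x in feature]
        stat = logrank_chi2(durations, events, in_left)
        if best is None or stat > best[1]:
            best = (threshold, stat)
    return best


# Made-up data: a low feature value goes with a short lifetime.
feature = [1, 2, 3, 10, 11, 12]
durations = [2, 3, 4, 20, 25, 30]
events = [True] * 6
print(best_split(feature, durations, events))
```

Scanning every distinct value as a candidate threshold is also what gives variables with many possible split points more chances to look good, which is the selection bias the slide attributes to RSF and which the separate variable-selection step is meant to avoid.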

Page 13: DSAA 2016 Churn Prediction in Mobile Social Games

The Results
With “Age of Ishtaria” Game Data

Page 14: DSAA 2016 Churn Prediction in Mobile Social Games


Binary classification results and comparison with other models

Page 15: DSAA 2016 Churn Prediction in Mobile Social Games

Censored data problem results

Predicted Kaplan-Meier survival curves as a function of time (days) for new or existing players

Page 16: DSAA 2016 Churn Prediction in Mobile Social Games


Validation -- Churn prediction

Page 17: DSAA 2016 Churn Prediction in Mobile Social Games


Validation -- Churn prediction

1000 bootstrap cross-validation error curves for the survival ensemble model and Cox regression

Page 18: DSAA 2016 Churn Prediction in Mobile Social Games

Survival ensembles approach

◎ Treating churn as a censoring problem is the right approach
○ the median survival time, i.e. the time at which the percentage of players surviving in the game is 50%, can be used as a time threshold to categorize a player as at risk of churning

◎ Binary problem -- static model
○ also brings relevant information
○ useful insight for short-term prediction

◎ SVM, ANN, Decision Trees, etc. are useful tools for regression or classification problems
○ in their original form they cannot handle censored data
○ this requires 1) modification of the algorithm or 2) transformation of the data
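The median survival time can be read off a predicted survival curve as the first time at which the survival probability drops to 50% or below. The sketch below assumes the curve is given as (time, probability) pairs sorted by time; the function names and the `horizon_days` parameter used for the risk flag are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def median_survival_time(
        curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """First time t with S(t) <= 0.5, or None if the curve stays above 0.5."""
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None


def at_risk_of_churning(curve: Sequence[Tuple[float, float]],
                        horizon_days: float) -> bool:
    """Flag a player whose median survival time falls within the horizon."""
    median = median_survival_time(curve)
    return median is not None and median <= horizon_days


# Made-up predicted curve for one player: (day, survival probability).
curve = [(5, 0.8), (8, 0.6), (12, 0.3)]
print(median_survival_time(curve))
print(at_risk_of_churning(curve, horizon_days=14))
```

When the curve never reaches 50% within the observed period, the median is undefined, so the sketch returns `None` rather than extrapolating.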

Page 19: DSAA 2016 Churn Prediction in Mobile Social Games

Summary and conclusion

◎ Application of the state-of-the-art algorithm “conditional inference survival ensembles”
○ to predict churn
○ and the survival probability of players in social games

◎ Model able to make predictions every day in an operational environment

◎ Adapts to other game data: democratize Game Data Science

◎ Relevant information about whale behaviour
○ discovering new playing patterns as a function of time
○ classifying gamers by risk factors of survival experience

◎ Step towards the challenging goal of a comprehensive understanding of players

Page 20: DSAA 2016 Churn Prediction in Mobile Social Games


Other work of the authors related to Game Data Science

Discovering Playing Patterns: Time Series Clustering of Free-To-Play Game Data
Alain Saas, Anna Guitart and África Periáñez
IEEE CIG 2016

Special Session on Game Data Science
Chaired by Alain Saas and África Periáñez
IEEE/ACM DSAA 2016
www.gamedatascience.org