21
May, 2015 Bui Van Hong Email: [email protected]

churn prediction in telecom

Embed Size (px)

Citation preview

Page 1: churn prediction in telecom

May, 2015

Bui Van Hong

Email: [email protected]

Page 2: churn prediction in telecom

Agenda

Churn prediction in prepaid mobile telecommunication network

Machine Learning

Introduction customer churn

Diagram of possible customer states

Churn prediction Model

Classification accuracy

Machine learning algorithm

Support vector machine

Nearest neighbour machine

Multilayer percenptron neural network machine

Decision tree machine

Native bayes machine

Feature presentation & Demo

List of attributes

Build Training & testing data

Demo

Page 3: churn prediction in telecom

3

Machine Learning

Supervised learning

Supervised learning is the machine learning task of inferring a function from labeled

training data.

unsupervised learning

Supervised learning is the machine learning task of inferring a function from unlabeled

training data.

Semi-supervised learning falls between unsupervised learning (without any labeled

training data)

and supervised learning (with completely labeled training data).

Reinforcement learning

Reinforcement learning is an area of machine learning inspired by behaviorist

psychology

(chess player)

Deep learning

Deep learning (deep machine learning, or deep structured learning, or hierarchical

learning, or sometimes DL) is a branch of machine learning based on a set of

algorithms that attempt to model high-level abstractions in data by using model

architectures, with complex structures or otherwise, composed of multiple non-linear

transformations

Page 4: churn prediction in telecom

4

Introduction

- Customer churn is define as the lose of customers

- Churn in prepaid service is actually measure base on the

lack of activity in the network over a period time, thus

there is no formal notification from customer of ending a

contract term, our goal is to infer when this lack of

activity may happen in the future for each active

customer

- Churn prediction can be viewed as supervised

classification problem where the behavior of previously

know churners an non-churners are used to train a binary

classified

Page 5: churn prediction in telecom

5

Diagram of possible customer states

Each customer can be in one of the following states:

New, active, inactive or churn

Page 6: churn prediction in telecom

6

Churn prediction Model

Training data

Testing data

Real Model Feature extraction

Requires Expert Knowledge in Telco

industry

Learning algorithm & Prediction

Independent

(1) Build

(2) Test

(3) Test=OK

(4) Predict

(5) Update

ETL 1

ETL 2

Page 7: churn prediction in telecom

7

Classification Accuracy

Tiêu chí đánh giá độ chính xác của kết quả

Actual class/Predictive class Churned Non-churned

Churned True Positive (TP) False Negative (FN)

Non-churned False Positive (FP) True Negative(TN)

Accuracy=𝑇𝑃+𝑇𝑁

𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁

Churn rate =𝑇𝑃

𝑇𝑃+𝐹𝑃

Page 8: churn prediction in telecom

Agenda

Churn prediction in prepaid mobile telecommunication network

Introduction

Diagram of possible customer states

Churn prediction Model

Classification accuracy

Machine learning algorithm

Support vector machine

Nearest neighbor machine

Multilayer perceptron neural network machine

Decision tree machine

Native Bayes machine

Feature presentation & Demo

List of attributes

Demo-POC

Page 9: churn prediction in telecom

9

Support Vector Machine Learning Non-linear/Linear SVM

Linear

Functions

( ) Tg b x w x

Nonlinear

Functions

=-1 =+1

Page 10: churn prediction in telecom

10

Nearest Neighbor Learning k-Nearest Neighbor

Nearest

Neighbor

=-1 =+1

Page 11: churn prediction in telecom

11

Multilayer Perceptron (MLP) Neural Network

Learing MLP algorithm

1

2

3

x1

x2

=-1

=+1

Page 12: churn prediction in telecom

12

Decision Tree Learning Decision Tree algorithm

Decision

Tree

=-1 =+1

Y:churn N: non-churn

Page 13: churn prediction in telecom

13

Naïve Bayes Learning Naïve Bayes algorithm

Naïve Bayes classification

– Assumption that all input features are conditionally independent!

– MAP classification rule: for

)|()|()|(

)|,,()|(

)|,,(),,,|()|,,,(

21

21

22121

CXPCXPCXP

CXXPCXP

CXXPCXXXPCXXXP

n

n

nnn

Lnn ccccccPcxPcxPcPcxPcxP ,, , ),()]|()|([)()]|()|([ 1*

1***

1

),,,( 21 nxxx x

Page 14: churn prediction in telecom

Agenda

Churn prediction in prepaid mobile telecommunication network

Introduction

Diagram of possible customer states

Churn prediction Model

Classification accuracy

Machine learning algorithm

Support vector machine

Nearest neighbor machine

Multilayer perceptron neural network machine

Decision tree machine

Native Bayes machine

Feature presentation & Demo

List of attributes

Logical design

Demo

Evaluation & compartion

Page 15: churn prediction in telecom

15

List of attribute

Attribute Name Description

Number Of Outgoing calls Number Of Outgoing calls

Number Of Messages Sent Number Of Messages Sent

Usage GPRS Subscriber is provided with GPRS service (Yes, No)

Total MOU Minutes of usage ( MoU) per subscriber

Total Number Of Recharges Total Number Of Recharges

Total Credit Revenue Credit Revenue Per subscriber (Voice, SMS, GPRS, VAS)

Total Bonus Revenue Bonus Revenue Per subscriber (Voice, SMS, GPRS, VAS)

Customer Relationship Age classifies measures pertaining to Customers according to the number of years for which the Customer has had a relationship with the Telecommunications Services Provider.

Remaining credit Remaining credit per subscriber

Churn Churning customer status (yes, No)

Page 16: churn prediction in telecom

16

Logical Design

Input Description

Data BI system

Feature Domain expert in industry

Algorithm SVM, NB, Decision tree, MLP, k-Nearest Neighbour

Output Description

Churn/ no-churn Specify churn/no-churn for each subscribers

Page 17: churn prediction in telecom

17

Demo Sampling for training data

Attribute Description

NUM_OG_CALLS Number of outgoing call for Month -1

SUM_DURATION_OG Minutes of usage ( MoU) per subscriber

NUM_SMO Number Of Messages Sent for Month -1

NUM_DATA_UP Subscriber is provided with GPRS service (Yes, No) for Month -1

NUM_OG_CALLS_1 Number of outgoing call for Month -2

SUM_DURATION_OG_1 Minutes of usage ( MoU) per subscriber for Month -2

NUM_SMO_1 Number Of Messages Sent for Month -2

NUM_DATA_UP_1 Subscriber is provided with GPRS service (Yes, No) for Month -2

Recharge Number of rechard times for Month -1

Recharge_1 Number of rechard times for Month -2

RemainCredit Remain credit in Month -3

RemainCredit_1 Remain credit in Month -2

RemainCredit_2 Remain credit in Month -1

TOTAL_BONUS Revenue bonus of Month -2

TOTAL_CREDIT Revenue credit of Month -2

TOTAL_BONUS_1 Revenue bonus of Month -1

TOTAL_CREDIT_1 Revenue credit of Month -1

CLASS {1,0} Chun/no-churn

Total records: 2876 rows

Page 18: churn prediction in telecom

18

Demo Sampling for Input data

Attribute Description

NUM_OG_CALLS Number of outgoing call for Month -1

SUM_DURATION_OG Minutes of usage ( MoU) per subscriber

NUM_SMO Number Of Messages Sent for Month -1

NUM_DATA_UP Subscriber is provided with GPRS service (Yes, No) for Month -1

NUM_OG_CALLS_1 Number of outgoing call for Month -2

SUM_DURATION_OG_1 Minutes of usage ( MoU) per subscriber for Month -2

NUM_SMO_1 Number Of Messages Sent for Month -2

NUM_DATA_UP_1 Subscriber is provided with GPRS service (Yes, No) for Month -2

Recharge Number of rechard times for Month -1

Recharge_1 Number of rechard times for Month -2

RemainCredit Remain credit in Month -3

RemainCredit_1 Remain credit in Month -2

RemainCredit_2 Remain credit in Month -1

TOTAL_BONUS Revenue bonus of Month -2

TOTAL_CREDIT Revenue credit of Month -2

TOTAL_BONUS_1 Revenue bonus of Month -1

TOTAL_CREDIT_1 Revenue credit of Month -1

CLASS {1,0} ?

Total records: 15175 rows

Page 19: churn prediction in telecom

19

Demo Sampling for Input data

Attribute Description

NUM_OG_CALLS Number of outgoing call for Month -1

SUM_DURATION_OG Minutes of usage ( MoU) per subscriber

NUM_SMO Number Of Messages Sent for Month -1

NUM_DATA_UP Subscriber is provided with GPRS service (Yes, No) for Month -1

NUM_OG_CALLS_1 Number of outgoing call for Month -2

SUM_DURATION_OG_1 Minutes of usage ( MoU) per subscriber for Month -2

NUM_SMO_1 Number Of Messages Sent for Month -2

NUM_DATA_UP_1 Subscriber is provided with GPRS service (Yes, No) for Month -2

Recharge Number of rechard times for Month -1

Recharge_1 Number of rechard times for Month -2

RemainCredit Remain credit in Month -3

RemainCredit_1 Remain credit in Month -2

RemainCredit_2 Remain credit in Month -1

TOTAL_BONUS Revenue bonus of Month -2

TOTAL_CREDIT Revenue credit of Month -2

TOTAL_BONUS_1 Revenue bonus of Month -1

TOTAL_CREDIT_1 Revenue credit of Month -1

CLASS {1,0} ?

Total records: 15175 rows

Page 20: churn prediction in telecom

20

Demo Result

0%

10%

20%

30%

40%

50%

60%

70%

80%

Accuracy

Churn rate

Jan-15

NaiveBayes MultilayerPerceptron Nearest neighbour Decision tree SVM

Prediction actual-Churn

actual-non-Churn Prediction

actual-Churn

actual-non-Churn Prediction

actual-Churn

actual-non-Churn Prediction

actual-Churn

actual-non-Churn Prediction

actual-Churn

actual-non-Churn

Churn 7353 2955 4398 6550 2956 3594 5980 2906 3074 6137 2956 3181 3161 2497 664

non-Churn 7823 3596 4227 8626 3595 5031 9196 3645 5551 9039 3595 5444 12015 4054 7961

Accuracy 47% 53% 56% 55% 69%

Churn rate 45% 45% 44% 45% 38%

Page 21: churn prediction in telecom

Page 21