22
DATA SCIENCE FOR BUSINESS Lecture 5. Customer relationship management. Churn prediction. Classification. School of Data Analysis and Artificial Intelligence Department of Computer Science Moscow, May 22 nd , 2020. [email protected]

DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

DATA SCIENCE FOR BUSINESS

Lecture 5. Customer relationship management. Churn prediction. Classification.

School of Data Analysis and Artificial Intelligence Department of Computer Science

Moscow, May 22nd, 2020.

[email protected]

Page 2: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CUSTOMER RELATIONSHIP MANAGEMENTCRM an approach to managing a company's interaction with current and potential customers to improve business relationships with customers, focusing on customer retention and increasing sales

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Company functions• Marketing• Sales• Customer service support

Page 3: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CUSTOMER RELATIONSHIP MANAGEMENTThe service sector consists of the production of services instead of end products

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

• Competitive landscape, easy to switch• High customer acquisition cost• Repetitive interaction - service• Large customer LTV

Service sector industries• Telecommunication• Healthcare• Education• Financial services (banking, insurance, investment)• Professional services (accounting, legal, consulting)• Transportation• Hospitality and tourism• Mass media• Real estate• Utilities

Page 4: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

• Propensity to buy new service or product• Propensity to respond to an offer• Propensity to churn

PROPENSITY MODELING Efficient customer communication and interaction

Page 5: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CLASSIFICATION ALGORITHMS

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Examples of classification algorithms• Logistic regression• Decision trees• KNN k-nearest neighbors• Naïve Bayes• Ensemble methods:• Random forest • Gradient boosting decision trees (xgboost)• Neural networks & DL

Classification algorithms• Binary classification• Multi-class classification

Results:• Class assignments• Probability or score/ranking

Page 6: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

LOGISTIC REGRESSION

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Problem set-up:• Target variable y – binary 0/1 (class) • Predictors (x1,x,2,x3 ) – numerical

Page 7: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

DECISION TREE

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

• Decision node – feature/attribute test and rule (numerical or categorical)• Leaf node – outcome (class in classification)

Problem set-up:• Target variable binary class / multi-class • Predictors (x1,x,2,x3 ) – numerical or categorical

Page 8: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

RANDOM FOREST

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Problem set-up:• Target variable binary class / multi-class • Predictors (x1,x,2,x3 ) – numerical or categorical

• Bootsrap sampling of examples • Subsample of features/predictors per tree• Voting for the result

Page 9: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

MODEL TRAINING AND TESTINGSame rules apply as for regression algorithms

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Page 10: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CLASSIFICATION EVALUATION Quality metrics

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

True positivesTP

FalsePositives

FP

FalseNegatives

FN

True negativesTN

Actual

NegativePositive

Predicted

Yes (+)

No (-)

True positive = Predict event and event happens

True negative = Predict event does not happen, nothing happens

False positive = Predict event and event does not happen (false alarm)

False negative = Fail to predict event that does happen (missed alarm)

Page 11: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CLASSIER EVALUATION Confusion matrix

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

True positivesTP

FalsePositives

FP

FalseNegatives

FN

True negativesTN

Actual

NegativePositive

Predicted

Yes (+)

No (-)

Positive50%

Negative50%

Negative10%

Positive90%

Data set 1 Data set 2

“Majority class” classifier ( classify all to positive)

Page 12: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CLASSIFIER EVALUATION Normalized confusion matrix

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

True positivesratetpr

Falsepositives

ratefpr

Falsenegatives

ratefnr

True negativesratetnr

ActualNegativePositive

Predicted

Yes (+)

No (-)

True positivesTP

FalsePositives

FP

FalseNegatives

FN

True negativesTN

NegativePositive

Predicted

Yes (+)

No (-)

P N

Actual

tpr = TP / P = TP /(TP+FN)fpr = FP / N = FP /(FP+TN)

fnr = FN / P = 1 - tprtnr = TN / N = 1 - fpr

Accuracy = (TP +TN)/ (P+N) = tpr*P/(P+N) + (1-fpr)* N/(P+N)

Page 13: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CLASSIFIER EVALUATION ROC space

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

instance is positive, alg - positive:tpr = TP/P = ‘specificity’ = ‘recall’

(% of positive detected)

instance is negative, alg - positive:fpr = FP/N = ‘false alarm rate’ = 1 – ‘sensitivity’

(% of negatives wrongly reacted to)

P NY 0.4 0.08N 0.6 0.92

P NY 0.9 0.6N 0.1 0.4

P NY 0.6 0.6N 0.4 0.4

no positives

all positives

Page 14: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

RANKING CLASSIFIERSThreshold on ranking

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Page 15: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

RECEIVER OPERATING CURVE (ROC)

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Page 16: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

RECEIVER OPERATING CURVE (ROC)Classifier quality metric – ROC AUC

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

ROC AUC = area under the ROC

curve

ROC AUC = 1 perfectROC AUC =0.5 random (bad)

Page 17: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CUMULATIVE RESPONSE AND LIFT CURVESRanking classifiers

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Lift of a classifier represents the advantage it provides over random guessing.

Cumulative response curves plot the percentage of positives correctly classified (tpr = TP/P) , as a function of the percentage

of the instances in the data

Page 18: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

CUSTOMER CHURN MODELING

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Page 19: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

COST-SENSITIVE LEARNINGCost-benefit analysis

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

p(Y|p) = tprp(Y|n) = fprp(N|p) = fnrp(N|n) = tnr

p(p) = P/(P+N)p(n) = N/(P+N)

Page 20: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

PROFIT CURVESProfit from churn management campaign

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

P NY 0.92 0.14N 0.08 0.86

P N

Y $90 -$10

N $0 $0∑ $44.7

Page 21: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification

ONE MORE BOOK

School of Data Analysis and Artificial Intelligence Department of Computer Science

.

Page 22: DATA SCIENCE FOR BUSINESS - Leonid Zhukov · CLASSIFICATION ALGORITHMS School of Data Analysis and Artificial Intelligence Department of Computer Science. Examples of classification