Rockford Mutual Insurance Client Project
Report Prepared For:
Mr. Shane A. Heeren, Vice President of Marketing and Sales, Rockford Mutual Insurance
Report Prepared By:
Brent Brown, Karina Rodriguez, Margaret Sottosanto, Dan Warfield
Introduction to Marketing Analytics, Pui Ying Tong, Illinois State University
December 1st, 2017
Table of Contents
Introduction
Data preparation steps
Descriptive Statistics and Visualizations of the Data Set
Customer retention/churn, Loss analysis
Nearest Neighbor (KNN) Prediction
Discussion
Appendix
Introduction
The purpose of this study is to show how Rockford Mutual Insurance is performing according to
its historical data. This information can help our client make better business decisions in the
future. Our research objective was to identify the customers that Rockford Mutual Insurance
should target, based on meaningful insights from the auto policyholder dataset, which includes
variables such as age, insurance score, and marital status. We grouped these customers and
profiled them into categories such as the top five best and worst customer segments. Through
this we provide insights on ideal customers as well as “loss” customers, along with an analysis
of customer retention and churn.
Data Preparation Steps
We worked with two datasets. In the first, each row is a transaction; policies are renewed every
6 months. In the second, each row represents one unique customer and measures that customer's
total profit and total loss amounts during 2012-2014. In the customer-based dataset, we created
new fields: one indicating who churned and who did not during this time period, and one
indicating who left in less than a year. We also added two hypothetical new customers, filling in
their variable values with a particular kind of customer in mind. Finally, we added a new
variable named “Target”, which is 0 for every row except the two new customers we created, for
whom it is 1.
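The roll-up from transaction rows to one row per customer can be sketched as follows. This is a minimal illustration, not the procedure actually used for the report: the field names (`policy_num`, `effective_date`, `profit`, `loss_paid`) and both churn rules are assumptions, since the report does not state how the flags were defined.

```python
from collections import defaultdict
from datetime import date

def build_customer_table(transactions, window_end, renewal_days=183):
    """Roll transaction rows up to one row per policy (hypothetical field names)."""
    grouped = defaultdict(list)
    for row in transactions:
        grouped[row["policy_num"]].append(row)

    customers = []
    for policy_num, rows in grouped.items():
        first = min(r["effective_date"] for r in rows)
        last = max(r["effective_date"] for r in rows)
        # Assumed rule: churned if no term started within one renewal
        # period of the end of the observation window.
        churned = (window_end - last).days > renewal_days
        # Assumed rule: tenure runs from the first term start to the
        # end of the last six-month term.
        tenure_days = (last - first).days + renewal_days
        customers.append({
            "policy_num": policy_num,
            "total_profit": sum(r["profit"] for r in rows),
            "total_loss_paid": sum(r["loss_paid"] for r in rows),
            "churn": int(churned),
            "less_than_year": int(churned and tenure_days < 365),
            "target": 0,  # 1 is reserved for the hypothetical new customers
        })
    return customers
```

The hypothetical new customers would then be appended as extra rows with `target` set to 1.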
Recency, Frequency, and Monetary (RFM) analysis is a technique used to quantitatively
determine which customers are the most valuable. A higher RFM score represents a higher-quality
customer. The RFM bar chart shows us that males and females ages 30 through 64 are
Rockford Mutual’s best customers.
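As an illustration of how RFM scores can be built, the sketch below ranks customers into fifths on each dimension and combines the three scores into one number. The 1-5 binning, the combined formula (recency x 100 + frequency x 10 + monetary), and the field names are assumptions made for the example; the report does not say how its RFM_score was computed.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value 1-5 by rank; 5 marks the best fifth.

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = 1 + (rank * 5) // len(values)
    return scores

def rfm_table(customers):
    """Attach recency, frequency, monetary and combined scores (assumed scheme)."""
    # Fewer days since the last transaction is better, so recency is reversed.
    r = quintile_scores([c["days_since_last"] for c in customers],
                        higher_is_better=False)
    f = quintile_scores([c["transaction_count"] for c in customers])
    m = quintile_scores([c["total_profit"] for c in customers])
    return [
        {"id": c["id"], "recency": ri, "frequency": fi, "monetary": mi,
         "rfm": ri * 100 + fi * 10 + mi}
        for c, ri, fi, mi in zip(customers, r, f, m)
    ]
```

Under this scheme a customer in the top fifth on all three dimensions scores 555 and one in the bottom fifth scores 111.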
Descriptive Statistics and Visualizations of the Data Set
Churn Rate Based on Territory
We see that the top churn territories include 17, 50, 87, 90, and 92, and the low churn
territories include 1, 14, 57, and 81. However, some of the territories with the highest churn
rates (50 and 90) produce higher profits for the firm. We advise Rockford Mutual to focus less on
territories 17, 87, and 92, as these areas have very high churn rates with relatively low profit
levels.
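The territory comparison above amounts to a group-by on the customer table followed by a filter for high churn combined with low profit. A minimal sketch, assuming hypothetical field names and cutoffs that would have to be chosen by the analyst (the report does not give the thresholds it used):

```python
from collections import defaultdict

def territory_summary(customers):
    """Churn rate and average profit per territory."""
    totals = defaultdict(lambda: {"n": 0, "churned": 0, "profit": 0.0})
    for c in customers:
        t = totals[c["territory"]]
        t["n"] += 1
        t["churned"] += c["churn"]
        t["profit"] += c["total_profit"]
    return {
        terr: {
            "customers": t["n"],
            "churn_rate": t["churned"] / t["n"],
            "avg_profit": t["profit"] / t["n"],
        }
        for terr, t in totals.items()
    }

def deprioritize(summary, churn_cutoff, profit_cutoff):
    """Territories with high churn AND low average profit (cutoffs are assumptions)."""
    return sorted(
        terr for terr, s in summary.items()
        if s["churn_rate"] >= churn_cutoff and s["avg_profit"] < profit_cutoff
    )
```

A territory with high churn but high average profit is deliberately not flagged, which mirrors the treatment of territories 50 and 90 in the text.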
Among married customers, females ages 50-64 bring in the most total profit. Among single
customers, females ages 30-49 bring in the most total profit. The greatest total loss paid came
from married females ages 50-64 and from single females ages 50-64.
Among married customers, the highest average total profit came from males ages 19-21, and the
highest average total loss paid came from males ages 30-49. Among single customers, the highest
average total profit came from females ages 19-21, and the highest average total loss paid came
from males ages 16-18.
Analyses for “Insurance Score Level” segmentations
The four charts above show insurance score levels for four groups: married females, single
females, married males, and single males. All four show the customer’s age in relation to the
insurance score.
Churn Customers versus Stay Customers
The data in this chart demonstrate that there is a high correlation between churn and insurance
score. The chart also breaks customers down by sex and marital status. Regardless of marital
status, pleasure use describes the segments with the highest churn and insurance score.
Customers who left in less than a year
In the chart above, the data on the left demonstrates who left in less than a year. The pink bars
are females and the blue bars are males. The majority of people who stayed less than a year were
males age 19-21.
The Best/Ideal customers segment
The best customer segment is single males and females ages 30-64.
The worst customers segment
The worst customer segment is married males and females ages 16-18.
Nearest Neighbor (KNN) Prediction
Explanation of Nearest Neighbor Analysis and how it can be used to predict target outcome
Predicting Churn
Nearest Neighbor Analysis finds the points in a given set that are closest, or most similar, to
a given point. The closer two points are, the smaller the value of the distance function between
them. It can be used to predict a target outcome by choosing specific variables and running the
analysis on the data, which produces a chart in which the target outcomes are colored red.
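To make the idea concrete, here is a minimal nearest-neighbor classifier. It is a sketch under stated assumptions rather than the tool used for the report: features are standardized so that a wide-ranging variable such as insurance score does not dominate, distance is Euclidean, the prediction is a majority vote among the k closest customers, and the feature layout (age, married flag, insurance score, rate plan) is hypothetical.

```python
import math
from collections import Counter

def column_stats(rows):
    """Per-feature mean and standard deviation (std of 0 is replaced by 1)."""
    cols = list(zip(*rows))
    means = [sum(col) / len(col) for col in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(cols, means)
    ]
    return means, stds

def knn_predict(train, labels, query, k=3):
    """Majority label among the k training rows closest to the query."""
    means, stds = column_stats(train)

    def scale(row):
        return [(x - m) / s for x, m, s in zip(row, means, stds)]

    q = scale(query)
    # Smaller distance means a more similar customer.
    ranked = sorted(
        (math.dist(scale(row), q), label) for row, label in zip(train, labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

With label 1 meaning churned and 0 meaning stayed, the function returns the outcome that is most common among the query's nearest neighbors.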
Demonstration of KNN using scenarios
New Customer 1: Age: 19-21, Marital Status: Single, Insurance score:654, Rate plan:7
The outcome for churn is single females between the ages of 16 and 18, as shown by the red dot
in the image above.
New Customer 2: Age: 30-49, Marital Status: Married, Insurance score: 700, Rate plan: 5
Customer retention/churn, Loss analysis
Churn
In an assessment of the decision tree, we found that the terminal node with the highest churn
rate is node 107. These customers are on the highest rate plans (above 21), have made some sort
of claim within the last 3 years, and have an insurance score lower than 683. This node contains
829 customers.
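A terminal node is simply the set of customers who satisfy every split on the path from the root, so its size and churn rate can be checked by filtering the customer table with the node's rule. The sketch below does this for the node 107 description; the field names are hypothetical, and encoding "a claim within the last 3 years" as fewer than 3 loss-free years is an assumption.

```python
def node_stats(customers, rule, outcome="churn"):
    """Size of a node and the share of its members with outcome == 1."""
    members = [c for c in customers if rule(c)]
    if not members:
        return 0, None
    rate = sum(c[outcome] for c in members) / len(members)
    return len(members), rate

def node_107(c):
    """Rule read off the node 107 description (field encodings are assumed)."""
    return (
        c["rate_plan"] > 21
        and c["loss_free_years"] < 3   # assumed meaning of "claim in last 3 years"
        and c["insurance_score"] < 683
    )
```

Calling `node_stats(customers, node_107)` on the customer table returns the node's customer count and churn rate for comparison with the tree output.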
Less than a year
In an assessment of the decision tree, we found that the terminal node with the highest rate of
clients insured under Rockford for less than one year is node 53. These customers have a high
rate plan (greater than 18) with low driver violation points (between 0 and 8). This node
contains 1007 customers. In regard to long-term benefits, Rockford Mutual should avoid customers
with these attributes, as they do not stay with the firm for very long.
Total Loss Paid
In an assessment of the decision tree, we found that the terminal node with the highest amount
paid out by Rockford Mutual is node 14. These customers are single and aged 16-24. This tells
Rockford that young, inexperienced drivers are more likely to submit a claim that results in a
payout. There are 2120 customers in this node.
Logistic Regression
Less than a year churn (0/1) prediction
Step 0 Variables               Score  df   Sig.
PolicyNum                     45.707   1   .000
EffectiveDate               1312.224   1   .000
VehNum                       123.466   1   .000
Territory                     32.929   1   .000
VehicleAge                     6.367   1   .012
BISymbol                       8.462   1   .004
MPSymbol                      18.058   1   .000
CompSymbol                     7.225   1   .007
CollSymbol                     6.173   1   .013
Use                           16.579   4   .002
Use(1)                         2.160   1   .142
Use(2)                          .128   1   .720
Use(3)                         8.586   1   .003
Use(4)                         6.961   1   .008
AssignedDriverNumber           2.531   1   .112
DriverSex(1)                   5.391   1   .020
DriverAgeRange               208.081   7   .000
DriverAgeRange(1)             35.957   1   .000
DriverAgeRange(2)              8.688   1   .003
DriverAgeRange(3)             19.026   1   .000
DriverAgeRange(4)             24.211   1   .000
DriverAgeRange(5)             48.356   1   .000
DriverAgeRange(6)             38.264   1   .000
DriverAgeRange(7)             44.612   1   .000
DriverMarital(1)              31.832   1   .000
DriverViolationPoints        183.044   1   .000
AutoandHomeDiscount           86.404   2   .000
AutoandHomeDiscount(1)         5.030   1   .025
AutoandHomeDiscount(2)        80.039   1   .000
InsuranceScore               252.992   1   .000
InsuranceScoreLevel          345.063  15   .000
InsuranceScoreLevel(1)        51.026   1   .000
InsuranceScoreLevel(2)         8.979   1   .003
InsuranceScoreLevel(3)         8.064   1   .005
InsuranceScoreLevel(4)        11.924   1   .001
InsuranceScoreLevel(5)        10.274   1   .001
InsuranceScoreLevel(6)         3.913   1   .048
InsuranceScoreLevel(7)         1.491   1   .222
InsuranceScoreLevel(8)          .068   1   .794
InsuranceScoreLevel(9)         1.126   1   .289
InsuranceScoreLevel(10)       17.198   1   .000
InsuranceScoreLevel(11)        9.275   1   .002
InsuranceScoreLevel(12)       45.488   1   .000
InsuranceScoreLevel(13)       27.992   1   .000
InsuranceScoreLevel(14)       69.676   1   .000
InsuranceScoreLevel(15)       58.263   1   .000
RatePlan                     379.980   1   .000
TimesRenewed                 598.166   1   .000
LossFreeYears                469.979   1   .000
EarnedPremium                  6.131   1   .013
LossIncurred                    .470   1   .493
Profit                          .735   1   .391
LossPaid                       1.597   1   .206
LossReserve                     .218   1   .641
SubrogationSalvage              .006   1   .941
Expense                         .011   1   .916
Date_most_recent            1312.224   1   .000
Transaction_count            717.539   1   .000
TotalProfit                   46.913   1   .000
Recency_score               1010.091   1   .000
Frequency_score              381.032   1   .000
Monetary_score                 9.382   1   .002
RFM_score                   1136.174   1   .000
TotalExpense                    .213   1   .645
TotalLossPaid                   .813   1   .367
churn                       1451.099   1   .000
Churn Regression
Step 1a                              B        S.E.    Wald  df   Sig.          Exp(B)
PolicyNum                         .000        .000    .009   1   .926           1.000
EffectiveDate                     .000        .000   3.472   1   .062           1.000
VehNum                           -.314       7.570    .002   1   .967            .731
Territory                         .017        .188    .008   1   .930           1.017
VehicleAge                        .014       1.528    .000   1   .992           1.015
BISymbol                          .156       1.050    .022   1   .882           1.169
MPSymbol                         -.026        .612    .002   1   .966            .974
CompSymbol                       -.508       2.208    .053   1   .818            .602
CollSymbol                        .670       2.526    .070   1   .791           1.954
Use                                                   .067   4   .999
Use(1)                          10.724     138.816    .006   1   .938       45448.445
Use(2)                         -26.542    5494.202    .000   1   .996            .000
Use(3)                           3.873      19.584    .039   1   .843          48.106
Use(4)                          -7.219      98.828    .005   1   .942            .001
AssignedDriverNumber            -1.657      10.143    .027   1   .870            .191
DriverSex(1)                     3.921      21.524    .033   1   .855          50.446
DriverAgeRange                                        .153   7  1.000
DriverAgeRange(1)                7.334     209.557    .001   1   .972        1531.432
DriverAgeRange(2)               11.248      59.717    .035   1   .851       76737.908
DriverAgeRange(3)               14.362      79.335    .033   1   .856     1727190.743
DriverAgeRange(4)               -1.820      64.900    .001   1   .978            .162
DriverAgeRange(5)               14.089      44.692    .099   1   .753     1314487.340
DriverAgeRange(6)                9.891      42.888    .053   1   .818       19750.103
DriverAgeRange(7)               13.454      45.273    .088   1   .766      696388.045
DriverMarital(1)                 -.013      16.415    .000   1   .999            .987
DriverViolationPoints            1.016      15.023    .005   1   .946           2.762
AutoandHomeDiscount                                   .098   2   .952
AutoandHomeDiscount(1)           5.427      27.802    .038   1   .845         227.451
AutoandHomeDiscount(2)          -2.301      20.040    .013   1   .909            .100
InsuranceScore                    .067        .588    .013   1   .909           1.069
InsuranceScoreLevel                                   .135  15  1.000
InsuranceScoreLevel(1)          83.918   50681.202    .000   1   .999       2.786E+36
InsuranceScoreLevel(2)          96.146   50681.068    .000   1   .998       5.696E+41
InsuranceScoreLevel(3)          91.851   50680.974    .000   1   .999       7.768E+39
InsuranceScoreLevel(4)          22.308   50683.542    .000   1  1.000  4877870255.000
InsuranceScoreLevel(5)          94.693   50680.840    .000   1   .999       1.333E+41
InsuranceScoreLevel(6)          98.031   50680.814    .000   1   .998       3.752E+42
InsuranceScoreLevel(7)          99.889   50680.758    .000   1   .998       2.405E+43
InsuranceScoreLevel(8)         106.423   50680.662    .000   1   .998       1.656E+46
InsuranceScoreLevel(9)          96.340   50680.608    .000   1   .998       6.920E+41
InsuranceScoreLevel(10)        109.591   50680.556    .000   1   .998       3.933E+47
InsuranceScoreLevel(11)        109.404   50680.528    .000   1   .998       3.261E+47
InsuranceScoreLevel(12)        114.280   50680.519    .000   1   .998       4.278E+49
InsuranceScoreLevel(13)        109.313   50680.861    .000   1   .998       2.978E+47
InsuranceScoreLevel(14)        117.613   50683.793    .000   1   .998       1.198E+51
InsuranceScoreLevel(15)        155.272   50689.595    .000   1   .998       2.716E+67
RatePlan                         -.411       6.622    .004   1   .951            .663
TimesRenewed                      .170       1.250    .019   1   .892           1.186
LossFreeYears                    -.070       2.339    .001   1   .976            .932
EarnedPremium                    -.003        .052    .004   1   .950            .997
LossIncurred                      .002        .112    .000   1   .987           1.002
LossPaid                         -.001        .110    .000   1   .993            .999
LossReserve                      -.017        .114    .022   1   .883            .983
Expense                           .192       2.369    .007   1   .935           1.212
Transaction_count                -.447       3.158    .020   1   .887            .640
TotalProfit                       .002        .008    .041   1   .840           1.002
Recency_score                  115.619     195.026    .351   1   .553       1.631E+50
Frequency_score                  -.226      11.738    .000   1   .985            .798
Monetary_score                    .046      11.745    .000   1   .997           1.047
TotalExpense                     -.027        .382    .005   1   .943            .973
TotalLossPaid                     .002        .013    .021   1   .886           1.002
Constant                   1427670.770  767849.340   3.457   1   .063               .
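For the rows with one degree of freedom, the Wald, Sig., and Exp(B) columns of the churn regression table can be reproduced from B and S.E. alone, which is a useful check when reading the output. The sketch below uses the standard relationships (Wald = (B / S.E.)^2, Sig. from a chi-square distribution with 1 degree of freedom, Exp(B) = e^B); the function name and return layout are illustrative only.

```python
import math

def wald_summary(b, se):
    """Wald statistic, its p-value (1 df), and odds ratio for one coefficient."""
    wald = (b / se) ** 2
    # For 1 df, the chi-square upper tail equals erfc(sqrt(wald / 2)).
    sig = math.erfc(math.sqrt(wald / 2.0))
    return {"wald": wald, "sig": sig, "exp_b": math.exp(b)}
```

For VehNum (B = -.314, S.E. = 7.570) this gives a Wald of about .002, a Sig. of about .967, and an Exp(B) of about .731, matching the reported row. No Sig. value in the table falls below .05, so none of the individual coefficients is statistically significant at that level in this model.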
Discussion
Our overall conclusion for the client, based on our interpretation of the data analysis results,
is that insurance score and churn are highly correlated. We recommend that Rockford Mutual focus
its advertising and marketing efforts on the territories and demographics with low churn and high
profitability, so that marketing dollars are not wasted on people who are not very likely to buy.
Rockford Mutual’s ideal customers fall within the 30 to 64 age range. Strengths of this study
include our team working together and learning how to use real data to draw real insights.
Limitations of this study include technical difficulties with Tableau. A suggestion for future
research is to include more lifestyle variables (such as smoking) to reduce the margin of error.