© Deloitte Consulting, 2004
Predictive Modeling for Property-Casualty Insurance

James Guszcza, FCAS, MAAA
Peter Wu, FCAS, MAAA

SoCal Actuarial Club, LAX
September 22, 2004
Predictive Modeling: 3 Levels of Discussion

Strategy
- Profitable growth
- Retain most profitable policyholders

Methodology
- Model design (actuarial)
- Modeling process

Technique
- GLM vs. decision trees vs. neural nets…
Methodology vs Technique

How does data mining need actuarial science?
- Variable creation
- Model design
- Model evaluation

How does actuarial science need data mining?
- Advances in computing and modeling techniques
- Ideas from other fields can be applied to insurance problems
Semantics: DM vs PM

One connotation: Data Mining (DM) is about knowledge discovery in large industrial databases.
- Data exploration techniques (some brute force)
- e.g. discover the strength of credit variables

Predictive Modeling (PM) applies statistical techniques (like regression) after the knowledge discovery phase is completed.
- Quantify & synthesize relationships found during knowledge discovery
- e.g. build a credit model
Strategy: Why do Data Mining?
Think Baseball!
Bay Area Baseball

In 1999, Billy Beane (general manager of the Oakland Athletics) found a novel use of data mining.
- Not a wealthy team: ranked 12th (out of 14) in payroll
- How to compete with rich teams?

Beane hired a statistics whiz to analyze statistics advocated by baseball guru Bill James.
Beane was able to hire excellent players undervalued by the market.
A year after Beane took over, the A's ranked 2nd!
Implication

Beane quantified how well a player would do. Not perfectly, just better than his peers.

Implication: be on the lookout for fields where an expert is required to reach a decision by judgmentally synthesizing quantifiable information across many dimensions.
(Sound like insurance underwriting?)

Maybe a predictive model can beat the pro.
Example

Who is worse?... And by how much?
- A 20-year-old driver with 1 minor violation who pays his bills on time and was written by your best agent
- A mature driver with a recent accident who has paid his bills late a few times

Unlike the human, the algorithm knows how much weight to give each dimension…

Classic PM strategy: build underwriting models to achieve profitable growth.
Keeping Score

Billy Beane                   → CEO who wants to run the next Progressive
Beane's scouts                → Underwriter
Potential team member         → Potential insured
Bill James' stats             → Predictive variables, old or new (e.g. credit)
Billy Beane's number cruncher → You! (or people on your team)
What is Predictive Modeling?
Three Concepts

Scoring engines
- A "predictive model" by any other name…

Lift curves
- How much worse than average are the policies with the worst scores?

Out-of-sample tests
- How well will the model work in the real world?
- Unbiased estimate of predictive power
Classic Application: Scoring Engines

Scoring engine: a formula that classifies or separates policies (or risks, accounts, agents…) into:
- profitable vs. unprofitable
- retaining vs. non-retaining…

A (non-)linear function f( ) of several predictive variables produces a continuous range of scores:

score = f(X1, X2, …, XN)
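As an illustration of such an f( ), a toy linear scoring engine might combine a few policy characteristics into one continuous score. The variables and weights below are hypothetical, chosen only to sketch the idea, not taken from the talk:

```python
# A minimal illustrative scoring engine of the form score = f(X1, X2, ..., XN).
# Higher score = riskier policy; all weights here are made-up examples.

def score(driver_age, prior_violations, years_insured):
    """Toy linear scoring engine mapping policy characteristics to a score."""
    return (0.5 * max(0, 25 - driver_age)   # youth surcharge below age 25
            + 2.0 * prior_violations        # behavioral signal
            - 0.3 * years_insured)          # tenure credit

young = score(20, 1, 1)     # young driver, one violation, new customer
mature = score(45, 0, 10)   # mature, clean, long-tenured driver
```

The point of the formula is exactly the slide's: the weights, not just the functional form, do the work of trading off the dimensions against each other.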
What "Powers" a Scoring Engine?

Scoring engine: score = f(X1, X2, …, XN)

The X1, X2, …, XN are as important as the f( )!
- Why actuarial expertise is necessary

A large part of the modeling process consists of variable creation and selection.
- Usually possible to generate 100's of variables
- Steepest part of the learning curve
Model Evaluation: Lift Curves

- Sort data by score
- Break the dataset into 10 equal pieces
- Best "decile": lowest scores → lowest loss ratio (LR)
- Worst "decile": highest scores → highest LR
- Difference: "lift"

Lift = segmentation power
Lift → ROI of the modeling project
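The decile construction above can be sketched in a few lines. This is a minimal stdlib-only sketch assuming each policy is a (score, loss_ratio) pair; the function names are illustrative, not from the talk:

```python
# Sort policies by model score, split into 10 deciles, and compute
# the average loss ratio per decile. "Lift" is the worst-vs-best spread.

def decile_lift(policies):
    """policies: list of (score, loss_ratio) pairs.
    Returns the average loss ratio of each of 10 score-ordered deciles."""
    ranked = sorted(policies, key=lambda p: p[0])  # ascending score
    n = len(ranked)
    deciles = []
    for d in range(10):
        chunk = ranked[d * n // 10:(d + 1) * n // 10]
        deciles.append(sum(lr for _, lr in chunk) / len(chunk))
    return deciles

def lift(deciles):
    """Spread between the worst (highest-LR) and best (lowest-LR) decile."""
    return deciles[-1] - deciles[0]
```

With a well-separating model the decile averages rise monotonically from the best decile to the worst, which is exactly the shape of a good lift curve.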
Out-of-Sample Testing

- Randomly divide the data into 3 pieces: training data, test data, validation data
- Use the training data to fit models
- Score the test data to create a lift curve
- Perform the train/test steps iteratively until you have a model you're happy with
- During this iterative phase, the validation data is set aside in a "lock box"
- Once the model has been finalized, score the validation data and produce a lift curve
- Result: an unbiased estimate of future performance
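The random three-way split above can be sketched as follows; the 60/20/20 proportions are an illustrative assumption (the talk does not specify them):

```python
# Random train / test / validation split, stdlib only.
import random

def train_test_validate(rows, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Shuffle rows and split into train, test, and validation sets.
    The validation set is the 'lock box': untouched until the model is final."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fractions[0] * n)
    n_test = int(fractions[1] * n)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validate = shuffled[n_train + n_test:]   # lock box
    return train, test, validate
```

The discipline matters more than the mechanics: because the validation rows never influence any train/test iteration, the final validation lift curve is an honest estimate of out-of-sample performance.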
Comparison of Techniques

Models built to detect whether an email message is really spam.
- "Gains charts" from several models
- Analogous to lift curves; good for a binary target
- All techniques work ok!
- Good variable creation is at least as important as the modeling technique.

[Figure: "Spam Email Detection - Gains Charts": Perc.Fraud vs. Perc.Total for a perfect model, MARS, neural net, decision tree, GLM, and regression]
Credit Scoring is an Example

All of these concepts apply to credit scoring:
- Knowledge discovery in databases (KDD)
- Scoring engine
- Lift curve evaluation translates to LR improvement → ROI
- Blind-test validation

Credit scoring has been the insurance industry's segue into data mining.
Applications Beyond Credit

The classic: profitability scoring model
- Underwriting/pricing applications
- Retention models
- Elasticity models
- Cross-sell models
- Lifetime value models
- Agent/agency monitoring
- Target marketing
- Fraud detection
- Customer segmentation
  - no target variable ("unsupervised learning")
Data Sources

Company's internal data:
- Policy-level records
- Loss & premium transactions
- Agent database
- Billing
- VIN
- …

Externally purchased data:
- Credit
- CLUE
- MVR
- Census
- …
The Predictive Modeling Process
Early: Variable Creation
Middle: Data Exploration & Modeling
Late: Analysis & Implementation
Variable Creation

- Research possible data sources
- Extract/purchase data
- Check data for quality (QA)
  - Messy! (still deep in the mines)
- Create predictive and target variables
  - Opportunity to quantify tribal wisdom… and come up with new ideas
  - Can be a very big task!
- Steepest part of the learning curve
Types of Predictive Variables

- Behavioral: historical claim, billing, credit …
- Policyholder: age/gender, # employees …
- Policy specifics: vehicle age, construction type …
- Territorial: census, weather …
Data Exploration & Variable Transformation

- 1-way analyses of predictive variables
- Exploratory Data Analysis (EDA)
- Data visualization
- Use EDA to cap / transform predictive variables
  - Extreme values
  - Missing values
  - … etc.
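Capping extreme values and filling missing ones, as described above, can be sketched like this. The 99th-percentile cap and median fill are illustrative defaults chosen for the example, not prescriptions from the talk:

```python
# EDA-driven variable transformation: cap extremes, fill missing values.

def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    k = min(len(ordered) - 1, int(pct / 100 * len(ordered)))
    return ordered[k]

def cap_variable(values, pct=99):
    """Cap values above the pct-th percentile; fill None (missing)
    entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    cap = percentile(observed, pct)
    fill = percentile(observed, 50)   # median as a simple default
    return [min(v, cap) if v is not None else fill for v in values]
```

In practice the cap level and fill rule come from the 1-way EDA itself (looking at each variable's distribution), not from a fixed rule.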
Multivariate Modeling

- Examine correlations among the variables
- Weed out redundant, weak, poorly distributed variables
- Model design
- Build candidate models
  - Regression/GLM
  - Decision trees/MARS
  - Neural networks
- Select final model
Building the Model

1. Pare down the collection of predictive variables to a manageable set
2. Iterative process
   - Build candidate models on "training data"
   - Evaluate on "test data"
   - Many things to tweak:
     - Different target variables
     - Different predictive variables
     - Different modeling techniques
     - # NN nodes, hidden layers; tree splitting rules…
Considerations

- Do the signs/magnitudes of the parameters make sense? Are they statistically significant?
- Is the model biased for/against certain types of policies? States? Policy sizes? …
- Does predictive power hold up for large policies?
- Continuity: are there small changes in input values that might produce large swings in scores?
- Make sure that an agent can't game the system
Model Analysis & Implementation

- Perform model analytics
  - Necessary for the client to gain comfort with the model
- Calibrate models
  - Create a user-friendly "scale" (the client dictates)
- Implement models
  - Programming skills are critical here
- Monitor performance
  - Distribution of scores over time, predictiveness, usage of the model…
- Plan model maintenance
Modeling Techniques
Where Actuarial Science Needs Data Mining
The Greatest Hits

Unsupervised: no target variable
- Clustering
- Principal components (dimension reduction)

Supervised: predict a target variable
- Regression
- GLM
- Neural networks
- MARS: Multivariate Adaptive Regression Splines
- CART: Classification And Regression Trees
Regression and its Relations

GLM: relax regression's distributional assumptions
- Logistic regression (binary target)
- Poisson regression (count target)

MARS & NN: clever ways of automatically transforming and interacting input variables
- Why: sometimes "true" relationships aren't linear
- Universal approximators: model any functional form

CART is simplified MARS.
Neural Net Motivation

Let X1, X2, X3 be three predictive variables:
- policy age, historical LR, driver age

Let Y be the target variable:
- loss ratio

A NNET model is a complicated, non-linear function φ such that:

φ(X1, X2, X3) ≈ Y
In visual terms…

[Figure: network diagram. Inputs X1, X2, X3 (plus a constant 1) feed hidden nodes Z1 and Z2 via weights a01, a02, a11, a12, a21, a22, a31, a32; the hidden nodes (plus a constant 1) feed the output Y via weights b0, b1, b2.]
NNET lingo

- Green: "input layer"
- Red: "hidden layer"
- Yellow: "output layer"
- The {a, b} numbers are "weights" to be estimated.
- The network architecture and the weights constitute the model.
In more detail…

Z1 = 1 / (1 + exp(-(a01 + a11·X1 + a21·X2 + a31·X3)))

Z2 = 1 / (1 + exp(-(a02 + a12·X1 + a22·X2 + a32·X3)))

Y = 1 / (1 + exp(-(b0 + b1·Z1 + b2·Z2)))
In more detail…

The NNET model results from substituting the expressions for Z1 and Z2 into the expression for Y.
In more detail…

Notice that the expression for Y has the form of a logistic regression. Similarly with Z1 and Z2.
In more detail…

You can therefore think of a NNET as a set of logistic regressions embedded in another logistic regression.
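The forward pass of this two-hidden-node network is a few lines of code. This is a sketch assuming the weight layout from the diagram (a's from inputs to hidden nodes, b's from hidden nodes to output); the specific weight values in the example are made up:

```python
# Forward pass of the two-hidden-node NNET: two logistic regressions (Z1, Z2)
# feeding a third logistic regression (Y).
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def nnet_score(x1, x2, x3, a, b):
    """a[j] holds (a0j, a1j, a2j, a3j) for hidden node Zj (intercept first);
    b holds (b0, b1, b2) for the output node."""
    z1 = logistic(a[0][0] + a[0][1] * x1 + a[0][2] * x2 + a[0][3] * x3)
    z2 = logistic(a[1][0] + a[1][1] * x1 + a[1][2] * x2 + a[1][3] * x3)
    return logistic(b[0] + b[1] * z1 + b[2] * z2)

# Hypothetical fitted weights, for illustration only:
a = [(0.1, 0.2, -0.3, 0.4), (-0.2, 0.5, 0.1, -0.1)]
b = (0.0, 1.0, -1.0)
y = nnet_score(1.0, 2.0, 3.0, a, b)
```

In a real model the weights would be estimated by training (e.g. backpropagation); here they are fixed so the "logistic regressions inside a logistic regression" structure is visible.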
Universal Approximators

The essential idea: by layering several logistic regressions in this way…
…we can model any functional form, no matter how many non-linearities or interactions between the variables X1, X2, …
- by varying only the # of nodes and training cycles

NNETs are sometimes called "universal function approximators".
MARS / CART Motivation

- NNETs use the logistic function to combine variables and automatically model any functional form.
- MARS uses an analogous clever idea, its "basis functions", to do the same work.
- CART can be viewed as simplified MARS: its basis functions are horizontal step functions.
- NNETs, MARS, and CART are all cousins of classic regression analysis.
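The contrast between these basis functions is easy to show concretely. A sketch, not from the talk, with illustrative knot locations and weights:

```python
# MARS builds models from hinge basis functions (piecewise-linear, zero on
# one side of a knot); a CART split acts like a horizontal step function.

def hinge(x, t):
    """MARS-style basis function with knot t."""
    return max(0.0, x - t)

def step(x, t):
    """CART-style basis function: constant on each side of the split t."""
    return 1.0 if x > t else 0.0

def mars_like(x):
    """A toy MARS-style fit: a weighted sum of hinge functions."""
    return 2.0 + 0.5 * hinge(x, 10.0) - 0.3 * hinge(x, 25.0)
```

Summing hinges gives a continuous piecewise-linear fit; summing steps (as a tree does across its leaves) gives a piecewise-constant one, which is the sense in which CART is "simplified MARS".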
References

For beginners: Data Mining Techniques
- Michael Berry & Gordon Linoff

For mavens: The Elements of Statistical Learning
- Jerome Friedman, Trevor Hastie, Robert Tibshirani