1
Does Credit Score Really Help Explain Insurance Losses?
Cheng-Sheng Peter Wu, FCAS, ASA, MAAA
Jim Guszcza, ACAS, MAAA, Ph.D.
2
Themes
The History
What Does the Question Mean?
Simpson’s Paradox – Need for Multivariate Analysis
What Has Been Done So Far?
Our Large-Scale Data Mining Experience
Going Beyond Credit
Conclusions
3
The History
Pricing/Class Plans
Few factors before World War II
Explosion of class plan factors after the War
Current class plans (Auto) – territory, driver, vehicle, loss and violation, others, tiers/company, etc.
Actuarial techniques – Minimum Bias & GLM
4
The History
Credit
First important factor identified over the past 2 decades
Composite multivariate score vs. raw credit information
Introduced in the late 1980s and early 1990s
Viewed at first as a “secret weapon”
Currently almost everyone is using it
Industry scores vs. proprietary scores
Quiet, confidential, controversial, black-box, etc.
5
What Does the Question Mean?
Can Credit Score Really “Explain” Insurance Losses?
“X explains Y”
– Weaker than claiming that X causes Y
– Stronger than merely reporting that X is correlated with Y
6
What Does the Question Mean?
Working Definition
We say that “X helps explain Y” if:
– X is correlated with Y
– The correlation does not go away when other available, measurable information is introduced
7
What Does the Question Mean?
Intuition Behind the Definition
It might be okay for X to be a proxy for a “true” cause of Y
– Testosterone level might be a true cause of auto losses… but it’s not available
– Age/Gender is a reasonable proxy
It might not be okay for X to be a proxy for other available predictive information
8
What Does the Question Mean?
Applying the Definition
Suppose we see that credit score plays an important role in a multivariate regression equation that predicts loss ratio
Then it is fair to say that credit helps explain insurance losses
A multivariate study is needed
9
Simpson’s Paradox – Need for Multivariate Analysis
Statistics can lie
Illustrates how a univariate association can lead to a spurious conclusion
The “true” explanatory factor is masked by the spurious correlation
Famous example: 1973 Berkeley admissions data
10
Simpson’s Paradox – Need for Multivariate Analysis
The Berkeley Example (stylized)
2200 people applied for admission
1100 men; 1100 women
210 men, 120 women were accepted.
Clear-cut case of gender discrimination… Or is it?
11
Simpson’s Paradox – Need for Multivariate Analysis
            # Applicants           # Accepted           % Accepted
          Arts   Eng  Total     Arts   Eng  Total     Arts   Eng  Total
Female    1000   100   1100      100    20    120      10%   20%    11%
Male       100  1000   1100       10   200    210      10%   20%    19%
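The table can be checked with a few lines of Python. This is a minimal sketch that uses only the stylized counts shown above; the names `COUNTS` and `acceptance_rate` are illustrative and do not come from the original slides.

```python
# Stylized Berkeley counts: (gender, school) -> (applicants, accepted).
COUNTS = {
    ("Female", "Arts"): (1000, 100),
    ("Female", "Eng"): (100, 20),
    ("Male", "Arts"): (100, 10),
    ("Male", "Eng"): (1000, 200),
}


def acceptance_rate(gender=None, school=None):
    """Acceptance rate over all cells matching the given gender and/or school."""
    applicants = accepted = 0
    for (g, s), (n, k) in COUNTS.items():
        if gender in (None, g) and school in (None, s):
            applicants += n
            accepted += k
    return accepted / applicants


# Univariate view: acceptance rate by gender only.
overall = {g: acceptance_rate(gender=g) for g in ("Female", "Male")}

# Multivariate view: acceptance rate by gender within each school.
by_school = {
    (g, s): acceptance_rate(gender=g, school=s)
    for g in ("Female", "Male")
    for s in ("Arts", "Eng")
}

for g, r in overall.items():
    print(f"{g}: {r:.1%} overall")
for (g, s), r in by_school.items():
    print(f"{g}, {s}: {r:.0%}")
```

The aggregate rates differ by gender, while the within-school rates are identical. The gap comes entirely from which school each group applied to.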
12
Simpson’s Paradox – Need for Multivariate Analysis
REGRESSION RESULTS
             Beta   T-Score
Intercept   0.109      10.2
Gender      0.082       5.1

             Beta   T-Score
Intercept    0.10      9.20
Gender       0.00      0.00
School       0.10      3.80
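The beta coefficients in the regression results above can be reproduced from the stylized admission counts. The sketch below assumes an ordinary least squares fit of a 0/1 "accepted" indicator on 0/1 dummies for gender (male = 1) and school (Eng = 1); the slides do not state the exact specification, so that coding is an assumption, and the helper names `solve` and `ols` are illustrative. It does not attempt to reproduce the t-scores.

```python
from fractions import Fraction

# (male, eng, applicants, accepted) from the stylized Berkeley table.
CELLS = [
    (0, 0, 1000, 100),
    (0, 1, 100, 20),
    (1, 0, 100, 10),
    (1, 1, 1000, 200),
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination, exactly."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def ols(design):
    """OLS via the normal equations X'X b = X'y, built from grouped cell counts."""
    rows = [(design(male, eng), n, k) for male, eng, n, k in CELLS]
    p = len(rows[0][0])
    xtx = [[sum(n * x[i] * x[j] for x, n, _ in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(k * x[i] for x, _, k in rows) for i in range(p)]
    return solve(xtx, xty)


# Model 1: intercept + gender.
gender_only = ols(lambda male, eng: (1, male))
# Model 2: intercept + gender + school.
gender_school = ols(lambda male, eng: (1, male, eng))

print([round(float(b), 3) for b in gender_only])
print([round(float(b), 3) for b in gender_school])
```

Under these assumptions, the gender coefficient is about 0.082 when gender is the only predictor and exactly zero once school is added. This is the sense in which the multivariate model removes the spurious association.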
13
Simpson’s Paradox – Need for Multivariate Analysis
                    # Policies            # Policies w/Claims           Frequency
               Adult Youthful Total     Adult Youthful Total     Adult Youthful Total
Good Credit     1000      100  1100       100       20   120       10%      20%   11%
Bad Credit       100     1000  1100        10      200   210       10%      20%   19%
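As a worked check on this stylized table, each total frequency is a mix-weighted average of the adult and youthful frequencies. The notation $f_{\text{good}}$ and $f_{\text{bad}}$ is introduced here for illustration only.

```latex
\[
f_{\text{good}} = \frac{1000}{1100}\,(10\%) + \frac{100}{1100}\,(20\%)
                = \frac{120}{1100} \approx 11\%
\]
\[
f_{\text{bad}} = \frac{100}{1100}\,(10\%) + \frac{1000}{1100}\,(20\%)
               = \frac{210}{1100} \approx 19\%
\]
```

The class-level frequencies (10% and 20%) are the same for both credit groups. Only the weights differ, so in this constructed example the gap between 11% and 19% reflects the adult/youthful mix rather than credit.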
14
What Has Been Done So Far
We (actuaries) have been quiet
Few published actuarial studies/opinions
– NAIC/Tillinghast (1997)
– Monaghan’s Study (2000)
Recent/related studies
– Virginia State Study (1999)
– CAS Sub-Committee (2002)
– Washington State Study (2003)
– University of Texas Study (2003)
15
What Has Been Done So Far
Relevant Actuarial/Statistical Principles
Pure premium vs. loss ratio
– Loss ratio studies go beyond existing rating plans, and are implicitly multivariate
Independence vs. correlation
– Most insurance variables are correlated
Univariate vs. multivariate
– Correlated variables call for multivariate studies for true answers (Simpson’s Paradox)
Credibility vs. homogeneity
– Studies need to be credible and representative
16
What Has Been Done So Far
The Tillinghast Study
9 companies’ data, seems representative
Loss ratio study
No other predictive variables included in the study
No detailed information given about the data
Strong correlation with loss ratio, seems credible
This is true, but it doesn’t answer our question and doesn’t quiet the critics
17
What Has Been Done So Far
Tillinghast Study of 9 Companies' Data
Loss Ratio Relativity of the Best and Worst 20% of Credit Score
Co1 Co2 Co3 Co4 Co5 Co6 Co7 Co8 Co9 Avg
Best 20% -38% -29% -19% -15% -14% -34% -22% -22% -36% -25%
Worst 20% 48% 20% 32% 30% 46% 59% 20% 22% 95% 41%
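The Avg column can be checked directly from the nine company figures. The slide does not say how the average was computed; the sketch below assumes a simple unweighted mean across companies, and the variable names are illustrative.

```python
from statistics import mean

# Loss ratio relativities (in percent) for Co1..Co9, copied from the table.
BEST_20 = [-38, -29, -19, -15, -14, -34, -22, -22, -36]
WORST_20 = [48, 20, 32, 30, 46, 59, 20, 22, 95]

# Assumption: unweighted mean across the nine companies.
avg_best = mean(BEST_20)
avg_worst = mean(WORST_20)

print(f"Best 20%:  {avg_best:.0f}%")
print(f"Worst 20%: {avg_worst:.0f}%")
```

Rounded to whole percentages, the unweighted means agree with the Avg column in the table.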
18
What Has Been Done So Far
Monaghan’s Study
Loss ratio study
Large amount of data – credible analysis
Analyzes individual credit variables as well as score
Multivariate analysis – limited to score + 1 traditional rating variable at a time
Shows that strong correlations with loss ratio do not go away in the presence of other variables
Another good step, but we can go further
19
Our Large-Scale Data Mining Experience
Our Work
Loss ratio studies
Multiple studies – representative
Large amounts of data – credible
Hundreds of variables tested along with credit – truly multivariate
– Policy, driver, vehicle, coverages, billing, agency, external data, synthetic, etc.
Sound actuarial and statistical model design
Disciplined data mining process
20
Our Large-Scale Data Mining Experience
What Have We Found Out?
Credit score is always one of the top variables selected for the multivariate models
Credit score has among the strongest parameters and statistical measurements (t-score)
– Credit’s predictive power does not go away in the truly multivariate context
Removing credit score dampens the predictive power of the models
21
Our Large-Scale Data Mining Experience
What Do We Conclude?
We conclude that credit score bears an unambiguous relationship to insurance losses, and is not a mere proxy for other kinds of information available to insurance companies.
This does not mean that credit score is the “cause” of insurance losses
22
Our Large-Scale Data Mining Experience
Why Is Credit Score Correlated with Insurance Losses?
Beyond the scope of our work
– Emphasis is not on causation
Plausible speculations include
– Stress/planning & organization
– Risk-seeking behavior
– ??
Analogy: Age/Gender might be a proxy for testosterone
23
Going Beyond Credit
Can We Do Well Without Credit?
YES: non-credit predictive models are
– A valuable alternative to credit scores
– Flexible
– Tailored to individual companies
– Comparable in predictive power to credit scores
Also possible to build mixed credit/non-credit models
24
Going Beyond Credit
Keys to Building Successful Non-Credit Models:
Fully utilize all sources of information
– Leverage company’s internal data sources
– Enrich with other external data sources
Use large amounts of data
Employ a disciplined analytical process
Utilize state-of-the-art modeling tools
Apply multivariate methodology
25
Going Beyond Credit
Advantages of Going Beyond Credit
Next generation of competitive advantage
More variables, more predictive power
Leverages company’s internal data sources
More flexibility
Address regulatory issues and public concerns
Expense savings
Everyone gets a score (less of a “no hit” problem)
More customized – less “plain vanilla” than credit score
26
Conclusions
Credit works… even in a fully multivariate setting
But non-credit models can work well too!
What it means to us – beginning of a new era
– Advances in computer technology
– Advances in predictive modeling techniques
– Large scale multivariate studies now practical
– More external and internal info, anything else out there?
– Other ways to go beyond credit?
27
Conclusions
Future work on this topic
Multivariate pure premium analysis would provide more insights
Further study of public policy issues
– WA, VA came to opposite conclusions
Comparison of various existing scoring models