Haga clic para modificar el estilo de texto del patrón A ...actuaries.org/EVENTS/Congresses/Cancun/ica2002... · 3 An overview of the insurance market Presence of inadequate tools

Haga clic para m odificar el estilo de título del patrón

Haga clic para m odificar el estilo de texto del patrón

Segundo nivel

Tercer nivel

Cuarto nivel

Q uinto nivel

Presentation Date Consultant's Name 1

A Discussion of Modeling Techniquesfor Personal Lines Pricing

Cristina Mano, BrazilElena Rasa, Italy

2

Agenda

� Overview of the insurance m arket

� Pricing approach

� Overview of the m ethodologies: GLM , Decision Trees andNN

� Com parison am ong the three techniques

� Conclusions

3

An overview of the insurance m arket

� Presence of inadequate tools for acom plete and effective analysis of the client

� Poor knowledge of clients

� Som e lines of business are running at aloss in m ost of the countries (i.e. m otorbusiness)

� Low level of sophistication

� Cross-subsidies am ong clients

4

Critical areas

� The risk increases with:

� high exposure on insurance business;

� ultim ate cost not com pletely recognized ;

� inadequate selection of underwritten risks ;

� not effective claim m anagem ent ;

� low cross-selling level ;

� not enough em phasis on client.

� The necessary elem ents for reaching the objective “client” are:

� econom ic cost estim ation;

� lapse probability of the insured and his elasticity of dem and.

� Com petitive M arket Analysis

5

The profitability varies with the risk profile

Contribution to thecum m ulative valueof the portfolio

Good clientsDeveloping new services

singling out possible

preservation strategies

To actively manageIm proving the profitability of

existing products

Offering value added services

Increasing num ber of

products for each client

(cross-selling)

“Up or out?”Is it possible to increase the

value?

�Decreasing the costs and

expenses?

�Changing products?

W hat clients should we

exclude?

0%

20%

40%

60%

80%

100%

120%

140%

160%

180%

200%

0% 20 40 60 80 100%

Percentage of clients

A

B

C

6

Nowadays, too m uch attention is paid to the cancellation of “bad”risk

… it could be better to find the right strategies so that:

� the prem ium be correctly calibrated on the risk;

� the econom ic cost of the client be projected;

� new “ad hoc” products be created;

� the portfolio be segm ented in order to define the m arketniches, which destroy value.

7

Pricing Strategy

Com petitorsand strategy

Internal/

External

bonds

Datacollection

Agents/Brokers

InsuredsTerritorialAnalysis

InternationalBenchm ark

Portfolioanalysis

Risk M odeling

Econom ic Cost

Com petitorAnalysis

Com m ercialTariff

+ Im pactAnalysis

our analysis:particular attention tothe “risk m odeling”

8

Risk M odelling: Overview of M ethodologies

� New m ultivariate techniques available which are replacingolder m ethodologies;

� These techniques represent advances in:

� quantifying the TRUE econom ic cost of writing eachpolicy;

� m easuring the econom ic im pact of adopting any rate planother than the actuarial m odel;

� understanding lapse and renewal experience;

� m arket pricing behaviour;

� in short, m anaging the business.

9

W hy M ultivariate Statistical Techniques?

� M ost rating variables are correlated;

� Different variables m ay be showing the sam e underlyingeffect;

� Repeated use of univariate techniques leads to double-counting of the sam e effects;

� They can capture interactions;

� They provide m ore than a point estim ate and standard errors.

10

18 19 20 21 22 23 24 25 260

1

2

3

4

5

0

1

2

3

4

5

… looking for m ore suitable techniques

GLM : Generalized LinearM odels

2 3 4

1Decision Trees

CHAID

CART

Geographic Analysis

Cluster Analysis

GIS

Neural Nets Σ

11

W hat is GLM ?

� It is a statistical procedure for m easuring the effect of oneor m ore independent variables upon a dependent variable;

� Dependent variable for ratem aking are:

� frequency

� severity

� pure prem ium

� GLM allows extrem e flexibility in m odel design:

� m ultiplicative, additive or m ixed plans

� different error distributions (i.e. Norm al, Gam m a, etc.)

� variable interactions (i.e. sex & age)

12

An exam ple of G eneralised Linear M odel

This statistical approach allows us to determ ine the cross-subsides am ong theclients and to create a theoretical rating structure, which penalizes bad clients andfavors good clients

Horse Power

0.00Up to 8 9 - 10 10 - 12 12 - 14 14 - 16 16 - 18 18 - 20 20+

0.50

1.00

1.50

2.00

2.50

Coe

ffici

ents

Actual coefficients

Theoretical coefficients

13

� Procedures for successively subdividing data intohom ogeneous groups;

� Like GLM s, they use a dependent variable and one or m oreindependent ones;

� Results are not necessarily sym m etric;

� Im plicitly capture the natural interactions between factors;

� Produces hom ogeneous groups (i.e., a tree structure), but norating plan or relativities;

� Possible m ethodologies, m ost fam ous:

� CHAID: Chi-Square Autom ated Inform ation Detection

� CART: Classification and Regression Tree

W hat are decision trees?

14

A decision tree is given by a set of decisional rules for predicting a fixed dependentvariable (for exam ple, claim severity, frequency or lapse rate)

Decision Trees: “Divide et im pera”

All Custom ers

12.93%

abde

17.08%

cgh

14.64%

Age G roups

Location

f

11.73%

ij

8.51%

Location Case Size

k

5.34%

acd

15.83%

heg

8.46%

h

22.93%

fi

12.06%

bdefi

9.11%

M arital Status

acgh

18.22%

s

8.30%

m l

14.49%

acg

12.67%

bdeh

4.61%

fi

8.04%

sm

3.61%

l

8.48%

s

25.00%

m

18.93%

l

6.86%

>l

12.14%

ab

13.03%

cd

20.68%

abcde

11.43%

fshi

16.61%

m

26.13%

f

14.85%

No. of Policies O riginal Term Location

Location Case Size

Average lapse propensity is 13%

Statistical analysis identifies custom ertypes with lapse propensities of 4%to 26%

Gender

15

Decision tree: an exam ple

1) Are you in Bonus classes?2) Do you live in one of the provinces of zone A

Gruppo A: AG, AP, AR, BA, BI, BN, BS,BZ, CB, CL, CO, CR, CZ, FI, FO, GE,GO, IM, LI, LO, LT, MC, MI, MO, NA,PA, PC, PE, PI, PO, PR, PZ, RA, RE, RI,RN, SA, SI, SP, TE, TS, TV, UD, VB,VE, VR, VT

Is there anyonewith age

between 18 to25 ?

Is it a woman?

Does anyonelive in city A2?

Does anyonelive in city A1?

Is there vehiclewith less than9 HP or morethan 18 HP?

Gruppo A2: AG, AP, AR, BI, BN, BZ,CO, CR, CZ, FI, GE, GO, IM, LO, LT,MC, NA, PC, PE, PI, PO, RI, RN, SA,SI, TE, TS, TV, VE, VT

Gruppo A1: AG, AP, AR, BA, BI, BN,BS, BZ, CB, CL, CO, CR, CZ, FI, FO,GE, GO, IM, LI, LO, LT, MI, MO, PA,PC, PE, PI, PO, PR, PZ, RA, RE, RI,RN, SI, SP, TE, TS, TV, UD, VB, VE,VR, VT

Risky client

Average client

Good client

16

� Neural networks: are non-linear predictive m odels that learnhow to detect a pattern in order to m atch a particular profilethrough a training process;

� It’s not necessary to split the pure prem ium into its com ponents:

� frequency and

� severity;

� Advantages and disadvantages:

W hat are Neural Networks?

+ Usable even when relationships am ong variables are unknown

+ W ill m odel non-linearity and interaction well

- The solutions can not be interpreted (“Black box”?)

- Can take trem endous com puting power and still not converge to a solution

17

Neural Networks are m otivated by a sim plified m odel of biologicalneurones in the brain

BIOLOGICAL M OTIVATIO N: PYRAM IDAL CELL

AND THEIR M OST SIM PLIFIED M ATHEM ATICAL ANALOGY ( E.G. TW O IN COM M ONLY-USED TYPES OF )

SIGM OID FUNCTION NEURON ( -> M LP ) RADIAL BASIS FUNCTION NEURON ( -> RBF )

Synapses

Dendrites

Axon

Cell body

Synapses

Σinputs weights sum activity

functionoutput

Σinputs weights sum radial basis

functionoutput

18

CONNECTED TO A NETW ORK

input layer hidden layer output layer

y

BUILD AN ARTIFICIAL NEURAL NETW ORK

input values output value(s)

x1

xN

.

.

Σ

Σ

Neural Networks (contd.)

19

Our analysis...

� A sim ulated portfolio was created using the distributions of theItalian m arket (public data - 1999) relative to the m ain ratingparam eters:

Bonus/M alus

Fuel

Territorial zones

Num ber ofinstallm ents

Horse power

Sex/Age

Claim s lim it

20

W hat about the dependent variables?

� Claim frequency

� Severity

� Pure prem ium

Param etric analysis

Non-param etric analysis

21

Choosing the rating param eters

� The selection of inappropriate variables can spoil the final result;

� Including a variable, which does not contribute in any way to thefinal result, could have the effect of dim inishing the m odelperform ance;

� This danger is very high in NN, m oderate in G LM , totally indifferentin CART/CHAID.

Age of policyholder

Vehicle Group

Vehicle Age

Use of snow-board

22

“Over-fitting” or “over-param eterization” danger

� The “over-fitting” (or “over-param eterisation”) concept is alwaysthere, whatever the statistical m ethod used;

� Using a too m any variables in the estim ation process, m ay lead to:

� m em orising also the idiosyncrasy of the training set (in thelanguage of neural nets),

� incom plete separation of the stochastic part from thedeterm inistic one (in the param etric language);

� In G LM there are specific statistics that report the “over-param eterisation” phenom enon;

� In decision trees analysis, it is necessary to reach m axim um depth(challenging the “over-fitting” danger), in order to proceed to thepruning step until the best tree is defined.

23

M anaging m issing values

� There are different m ethods for the m issing values m anagem ent:

� dropping the record with a m issing value - G LM ,

� substituting the m issing value with characteristic or typical values(average, quantiles, closest neighbor,… ) - CART, NN,

� estim ating the m issing value, after having assigned a fixed level(‘Errors’, 9999) - m anipulating the data,

� building separate m odels for each set of m issing values -m anipulating the data,

� using the non-coded values in the learning phase of the network -NN;

� CART and NN are very robust m ethods in m anaging the m issingvalues. The correct way to proceed is not clear…

24

M anaging anom alous values or outliers

� In order to identify every anom alous value, it is a good idea to startwith a “data m ining” phase using the following:

� residuals plots,

� over-dispersion analysis,

� Cook and Leverage statistics;

� O nce the outliers have been identified, it is possible:

� to assign a low m arginal probability,

� to drop the observation from the data set,

� to confine them in a separate class and to m ake an estim ate in asuccessive phase;

� Non-param etric techniques are m ore robust then the param etric onesin m anaging the anom alous values and outliers.

25

Is a Neural Net a “Black box”?

� If such a term refers to a presentational or a synthesis of theresults problem s, the answer is YES;

� if it refers to the arithm etic of the algorithm , the answer is NO or,at least, not m ore then other statistical techniques, includingGLM and Decision trees;

� In fact, working with a Neural Net:

� it is difficult to know which are the im portant variables to beincluded in the m odel and how they interact am ongthem selves; and,

� there is no structure of coefficients (relativities), as it is for theparam etric regression and there is not a final m odel;

Black box = low synthesis, low presentational power

26

Com puting tim e

� A few years ago, neural networks were used alm ostexclusively for “pattern recognition” problem s, m ainly due tothe long com puting tim e required;

� A sim ilar problem was true also for decision trees m ethods.CHAID, in particular, based on a contingency table where aCHI-squared is perform ed on each cell, could be very heavyfrom a com puting point of view;

� W ith the advent of new and m ore powerful com puters, eventhese techniques can be used in the solution of “everyday”problem s.

27

Reading and interpreting the results

� G LM

� The results are directly com parable with the rating coefficientsapplied by the insurance com panies. The reading is easy forspecialists;

� CART/CHAID

� It is a m ethod that com m unicates through im ages. The resultsare always in the form of an upside-down tree. The reading isvery easy also for lay people;

� NN

� It gives an estim ate close to the real observation of the database in the training and testing set. O nce the training set hasbeen m em orized, the network can be used in another sam plewhere the observation is m issing. The reading is not very easyeven by specialists.

28

Im plem entation

� G LM

� Fast and easy. The coefficient structure reproduces the ratingstructure of the com pany and it is directly com parable andeasy to im plem ent;

� CART/CHAID

� The result is very easy and readable. It is com posed by alim ited num ber of nodes and to each of them an averageprem ium is associated. The question is, is it acceptable tohave a rating structure consisting of 49-50 profiles?

� NN

� It is very useful as discrim inate analysis, but it needs a greatdeal of m odification in order to be im plem ented.

29

A com parison am ong the three techniques: a final report card

Possible but onerous/negative answer

Not applicable/indifferent

Good results/positive answer

30

Selection of rating param eters

“O ver-fitting” danger

M issing values m anagem ent

O utlier values m anagem ent

Black box

Com puting tim e

Reading and understandingresults

Im plem entation

O verall assessm ent

GLM NN CART

A com parison am ong the three techniques: a final report card

Haga clic para m odificar el estilo de título del patrón

Haga clic para m odificar el estilo de texto del patrón

Segundo nivel

Tercer nivel

Cuarto nivel

Q uinto nivel

Presentation Date Consultant's Name 31

� It’s im portant to understand the ideas behind the varioustechniques, in order to know how and when to use them ;

� It’s im portant to accurately assess the perform ance of a m ethod,to know how well it can be expected to work (… sim pler m ethodsoften perform as well as com plex ones!);

� in data m ining, understanding the system used is not always acrucial problem . A neural network that produces optim al estim atescan be preferable to easier but less efficient m odels;

� This is an exciting research area, that has applications in science,industry, finance, etc.

Pearls of wisdom ...

Documents

Haga clic para modificar el estilo de texto del patrón A ...actuaries.org/EVENTS/Congresses/Cancun/ica2002... · 3 An overview of the insurance market Presence of inadequate tools