Applications of Resampling Methods in Actuarial Practice by Dr. Richard Derrig Automobile Insurers Bureau of Massachusetts Dr. Krzysztof Ostaszewski Illinois

Applications of Resampling Methods in Actuarial Practice

by

Dr. Richard Derrig

Automobile Insurers Bureau of Massachusetts

Dr. Krzysztof Ostaszewski

Illinois State University

Dr. Grzegorz Rempala

University of Louisville

CAS Annual Meeting, Washington, DC

November 12-15, 2000

Actuarial Modeling Processes

Distributions of variables determined

Parametric, with parameters estimated from data

Monte Carlo simulation of variables

Cash flow testing, sensitivity analysis and profit testing

Integrated in the company DFA model

What can go wrong?

Ignoring uncertainties:

Maybe common in early models

Hardly the case in modern methodologies

Overfitting: the data must submit to prescribed distributions

May work in practice but not in theory

There is nothing more impractical than the wrong theory

Loss Distributions

Data clustered around certain values

Data truncated from below or censored from above

Mixtures of distributions possible

Data may simply not fit the desired parametric distribution

The Concept of Bootstrap

Random sample of size n from an unknown distribution F.

Create empirical distribution

Generate an IID random sequence (resample) from empirical distribution

Use it to estimate parameters or characteristics of the original distribution
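The resampling step can be sketched in Python as follows. This is a minimal illustration only: the loss amounts are made up (they are not the data discussed in this presentation), and `bootstrap_resample` is a hypothetical helper name.

```python
import random

def bootstrap_resample(sample, rng):
    """Draw an IID resample of size n from the empirical distribution.

    Sampling with replacement from the observed values is the same as
    sampling from the empirical CDF, which puts mass 1/n on each observation.
    """
    n = len(sample)
    return [rng.choice(sample) for _ in range(n)]

# Hypothetical loss amounts, for illustration only.
losses = [2, 3, 3, 5, 6, 8, 9, 15, 22, 27]

rng = random.Random(0)  # fixed seed so the run is reproducible
resample = bootstrap_resample(losses, rng)

# Use the resample to estimate a characteristic of the distribution (here, the mean).
resample_mean = sum(resample) / len(resample)
print(resample, resample_mean)
```

Repeating the last two steps many times gives a collection of replicated estimates whose spread reflects the sampling variability of the estimator.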

Overview of this Work

Basics of bootstrap (including estimating standard errors and confidence intervals)

Apply bootstrap to two empirical data sets

Compare bootstrap to traditional estimates

Smoothing bootstrap estimate

Clustered data, data censoring, inflation adjustment

Plug-in Principle

Given a parameter of interest depending on CDF F, estimate it by replacing F by its empirical counterpart obtained from the observed data.

This is referred to as the bootstrap estimate of the parameter
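As a small illustration of the plug-in principle, the sketch below builds the empirical CDF from a sample and substitutes it for F in two parameters: the mean and an exceedance probability 1 - F(t). The data, the threshold of 10, and the function name `empirical_cdf` are assumptions made for this example.

```python
def empirical_cdf(sample):
    """Return the empirical CDF: F_hat(x) = (number of observations <= x) / n."""
    n = len(sample)

    def F_hat(x):
        return sum(1 for v in sample if v <= x) / n

    return F_hat

# Hypothetical loss amounts, for illustration only.
losses = [2, 3, 3, 5, 6, 8, 9, 15, 22, 27]
F_hat = empirical_cdf(losses)

# Plug-in estimate of the mean: the mean of the empirical distribution.
mean_hat = sum(losses) / len(losses)

# Plug-in estimate of the exceedance probability 1 - F(10).
threshold = 10
exceedance_hat = 1 - F_hat(threshold)

print(mean_hat, exceedance_hat)
```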

Bootstrap Methodology

Efron (1979)

Bickel and Freedman (1981): conditions for consistency, quantile processes, multiple regression, and stratified sampling

Singh (1981): for many statistics bootstrap is asymptotically equivalent to the one-term Edgeworth expansion

Bootstrap SE estimate (Efron)

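$$\widehat{se}_B = \left\{ \frac{1}{B-1} \sum_{b=1}^{B} \left[ \hat{\theta}^*(b) - \hat{\theta}^*(\cdot) \right]^2 \right\}^{1/2}$$

where $\hat{\theta}^*(b)$ is the statistic computed on the $b$-th resample and $\hat{\theta}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*(b)$ is the mean of the $B$ replicates.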

Bootstrap Standard Error Estimate

Rarely practical to calculate standard error directly

Instead approximate with multiple resamples

Efron’s BESE, by the Law of Large Numbers, approximates the theoretical standard error in the limit

Should take about 250 resamples
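The standard error approximation from multiple resamples can be sketched as below. The function name `bootstrap_se`, the seed, and the data are assumptions for illustration; the default of 250 resamples follows the figure quoted above.

```python
import math
import random
import statistics

def bootstrap_se(sample, statistic, B=250, seed=0):
    """Approximate the bootstrap standard error of `statistic` using B resamples."""
    rng = random.Random(seed)
    n = len(sample)
    # One replicate of the statistic per resample drawn with replacement.
    replicates = [statistic(rng.choices(sample, k=n)) for _ in range(B)]
    mean_rep = sum(replicates) / B
    # Sample standard deviation of the replicates (divisor B - 1).
    return math.sqrt(sum((r - mean_rep) ** 2 for r in replicates) / (B - 1))

# Hypothetical loss amounts, for illustration only.
losses = [2, 3, 3, 5, 6, 8, 9, 15, 22, 27]

se_mean = bootstrap_se(losses, statistics.mean)
se_median = bootstrap_se(losses, statistics.median)
print(se_mean, se_median)
```

For the mean, the result can be checked against the textbook value s / sqrt(n); for statistics such as the median, where no simple formula is available, the same function applies unchanged.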

The Method of Percentiles

For the bootstrap estimate $\hat{\theta}^*$ of $\theta$, let $G^*$ be its distribution function

The bootstrap percentile method uses the inverse images of $\alpha$ and $1-\alpha$ under $G^*$ as the bounds of the confidence interval

In practice, these bounds are taken from multiple resamples, empirical percentiles
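A minimal sketch of the percentile method using empirical percentiles of the replicates follows. The function name `percentile_interval`, the index convention used to pick the empirical percentiles, the number of resamples, and the data are all assumptions for illustration. With bounds taken at levels alpha and 1 - alpha, the nominal coverage is 1 - 2 * alpha.

```python
import random
import statistics

def percentile_interval(sample, statistic, alpha=0.05, B=1000, seed=0):
    """Bootstrap percentile interval with bounds at levels alpha and 1 - alpha."""
    rng = random.Random(seed)
    n = len(sample)
    # Sorted replicates approximate the distribution G* of the bootstrap estimate.
    replicates = sorted(statistic(rng.choices(sample, k=n)) for _ in range(B))
    # Simple empirical-percentile convention (one of several in use).
    lo_idx = min(B - 1, max(0, round(alpha * B)))
    hi_idx = min(B - 1, max(0, round((1 - alpha) * B) - 1))
    return replicates[lo_idx], replicates[hi_idx]

# Hypothetical loss amounts, for illustration only.
losses = [2, 3, 3, 5, 6, 8, 9, 15, 22, 27]

lower, upper = percentile_interval(losses, statistics.median)
print(lower, upper)
```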

Application to Wind Losses: Quantiles

Hogg & Klugman (1984): data on 40 losses due to wind-related catastrophes in 1977

Standard approach to confidence intervals: normal approximation to the sample quantiles

Hogg and Klugman obtain: (9,32)

Using the bootstrap method of percentiles we obtain the interval (8,27), considerably shorter

Smoothed Bootstrap: Excess Losses

Estimate the probability that a wind loss will exceed a $29,500,000 threshold, i.e., 1 - F(29.5) with losses in millions of dollars. Plug-in estimate: 1 - F(29.5) = 0.05.

But relative frequency is constant on an interval containing 29.5, and data is rounded off.

Hogg and Klugman use MLE to fit truncated exponential and truncated Pareto distributions.

Solid line: exponential, Dashed line: Pareto

Smoothed Bootstrap using the three-term moving average smoother
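The slides do not spell out how the three-term moving average is applied. One plausible reading, sketched below purely as an assumption, is to average the relative frequency at each point of an integer grid with its two neighbours and then read the exceedance probability off the smoothed frequencies. The data, the threshold of 21.5, and the name `smoothed_pmf` are hypothetical.

```python
from collections import Counter

def smoothed_pmf(sample):
    """Three-term moving average of relative frequencies on an integer grid.

    The grid is extended by one point on each side so that no probability
    mass is lost at the ends. Assumes integer-valued (rounded) data.
    """
    n = len(sample)
    counts = Counter(sample)
    lo, hi = min(sample), max(sample)
    freq = [counts.get(v, 0) / n for v in range(lo, hi + 1)]
    padded = [0.0, 0.0] + freq + [0.0, 0.0]
    smoothed = [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3
        for i in range(1, len(padded) - 1)
    ]
    # smoothed[k] belongs to grid value lo - 1 + k.
    return {lo - 1 + k: p for k, p in enumerate(smoothed)}

# Hypothetical rounded loss amounts, for illustration only.
losses = [2, 3, 3, 5, 6, 8, 9, 15, 22, 27]
threshold = 21.5

raw_exceedance = sum(1 for x in losses if x > threshold) / len(losses)
pmf = smoothed_pmf(losses)
smoothed_exceedance = sum(p for v, p in pmf.items() if v > threshold)
print(raw_exceedance, smoothed_exceedance)
```

Unlike the raw relative frequency, the smoothed estimate changes as the threshold moves across neighbouring grid points, which is the motivation for smoothing rounded data.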

Clustered Data: Massachusetts Auto Bodily Injury Liability Data

432 closed losses, bodily injury liability in Boston territory for 1995, as of mid-1997

Policy limits capped 16 out of 432 losses, so the data is right censored

Overwhelming presence of suspected fraud and buildup claims. This causes some numerical values to have unusually high frequencies.

Clustered Data: Massachusetts Auto Bodily Injury Liability Data

No.      Injury Type      Total Amt Paid   Policy Limit
71       Strain/Sprain    3,500            50,000
72       Other            3,500            20,000
73       Strain/Sprain    3,650            20,000
74-76    Strain/Sprain    3,700            20,000
77-80    Strain/Sprain    3,750            20,000
81       Strain/Sprain    3,900            20,000
82-91    Strain/Sprain    4,000            20,000

Approximation to empirical CDF adjusted for clustering. Also zoomed in on (3.5, 5)

Bootstrap Estimates for Loss Elimination Ratio

Standard approach to reinsurance purchase: loss elimination ratio

Can use plug-in bootstrap estimate (empirical loss elimination ratio)

Better: smoothed empirical loss elimination ratio

The result is shown in the following figure:

SELER (smoothed empirical loss elimination ratio)
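The plug-in (empirical) loss elimination ratio can be sketched as below, assuming the usual definition for a deductible or retention d, namely E[min(X, d)] / E[X], with the expectations replaced by sample averages. The data, the value d = 5, and the function name `empirical_ler` are hypothetical; the smoothed version is not reproduced here.

```python
def empirical_ler(losses, d):
    """Empirical loss elimination ratio: sum of min(x, d) over sum of x."""
    return sum(min(x, d) for x in losses) / sum(losses)

# Hypothetical loss amounts, for illustration only.
losses = [2, 3, 3, 5, 6, 8, 9, 15, 22, 27]

ler_at_5 = empirical_ler(losses, 5)
print(ler_at_5)
```

Because the function is a statistic of the sample, its sampling variability can be assessed by evaluating it on bootstrap resamples in the same way as any other estimator.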

Policy Limits and Deductibles. Bootstrapping Censored Data

We use the Kaplan-Meier estimator

Can be viewed as a generalization of the usual empirical CDF, adjusted for the censoring of losses.

Next figure shows Kaplan-Meier vs. SELER, first censoring point at 20

Kaplan-Meier estimator

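$$\hat{S}(x) = \prod_{i:\, x_i \le x} \left( \frac{n-i}{n-i+1} \right)^{\delta_i}$$

where $x_1 \le \dots \le x_n$ are the ordered observations and $\delta_i = 1$ if the $i$-th observation is uncensored and $0$ if it is censored.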
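The product-limit formula can be evaluated directly, as in the sketch below. The observations, the policy limit of 20 in the example data, and the function name `kaplan_meier_survival` are assumptions for illustration; at tied values, uncensored observations are placed before censored ones, which is the usual convention.

```python
def kaplan_meier_survival(observations, x):
    """Kaplan-Meier estimate of S(x) = P(X > x).

    `observations` is a list of (value, uncensored) pairs, where `uncensored`
    is True for a fully observed loss and False for a loss capped by a limit.
    """
    # Order by value; at ties, uncensored observations come first.
    ordered = sorted(observations, key=lambda o: (o[0], not o[1]))
    n = len(ordered)
    s = 1.0
    for i, (value, uncensored) in enumerate(ordered, start=1):
        if value > x:
            break
        if uncensored:  # censored observations contribute a factor of 1
            s *= (n - i) / (n - i + 1)
    return s

# Hypothetical data: three observed losses and two censored at a limit of 20.
data = [(3, True), (5, True), (8, True), (20, False), (20, False)]

s_at_5 = kaplan_meier_survival(data, 5)
s_at_25 = kaplan_meier_survival(data, 25)
print(s_at_5, s_at_25)
```

When no observation is censored, the estimate reduces to one minus the usual empirical CDF.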

Kaplan-Meier vs. SELER

Some Conclusions

These ideas can be extended to all modeled variables

They should be extended

Most interesting for interest rates and capital assets in general

Time series and dependence of variables most challenging

Long tails may be problematic
