Applications of Resampling Methods in Actuarial Practice
by
Dr. Richard Derrig
Automobile Insurers Bureau of Massachusetts
Dr. Krzysztof Ostaszewski
Illinois State University
Dr. Grzegorz Rempala
University of Louisville
CAS Annual Meeting
Washington, DC
November 12-15, 2000
Actuarial Modeling Processes
Distributions of variables determined
  Parametric, with parameters estimated from data
Monte Carlo simulation of variables
Cash flow testing, sensitivity analysis and profit testing
Integrated in the company DFA model
What can go wrong?
Ignoring uncertainties:
  Maybe common in early models
  Hardly the case in modern methodologies
Overfitting: the data must submit to prescribed distributions
  May work in practice but not in theory
  There is nothing more impractical than the wrong theory
Loss Distributions
Data clustered around certain values
Data truncated from below or censored from above
Mixtures of distributions possible
Data may simply not fit the desired parametric distribution
The Concept of Bootstrap
Random sample of size n from an unknown distribution F.
Create empirical distribution
Generate an IID random sequence (resample) from empirical distribution
Use it to estimate parameters or characteristics of the original distribution
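The resampling step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the loss values, the seed, and the helper names `empirical_cdf` and `resample` are all hypothetical.

```python
import random

def empirical_cdf(sample):
    """Return the empirical CDF of the sample as a function of x."""
    data = sorted(sample)
    n = len(data)

    def F_hat(x):
        # Fraction of observations less than or equal to x.
        return sum(1 for v in data if v <= x) / n

    return F_hat

def resample(sample, rng):
    """Draw n IID values from the empirical distribution,
    i.e. sample the observed values with replacement."""
    return [rng.choice(sample) for _ in range(len(sample))]

rng = random.Random(2000)                              # fixed seed, for repeatability
losses = [2.0, 3.5, 3.5, 4.0, 6.0, 9.0, 15.0, 27.0]    # hypothetical losses
F_hat = empirical_cdf(losses)
star = resample(losses, rng)                           # one bootstrap resample
```

Any statistic computed on `star` is one bootstrap replicate of the same statistic computed on `losses`.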
Overview of this Work
Basics of bootstrap (including estimating standard errors and confidence intervals)
Apply bootstrap to two empirical data sets
Compare bootstrap to traditional estimates
Smoothing bootstrap estimate
Clustered data, data censoring, inflation adjustment
Plug-in Principle
Given a parameter of interest depending on CDF F, estimate it by replacing F by its empirical counterpart obtained from the observed data.
This is referred to as the bootstrap estimate of the parameter
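As a small illustration of the plug-in principle, the sketch below estimates two functionals of F, the mean and a tail probability 1 - F(t), by evaluating them on the empirical distribution. The data and function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def plug_in_mean(sample):
    """Mean of the empirical distribution: the plug-in estimate of E[X]."""
    return sum(sample) / len(sample)

def plug_in_exceedance(sample, threshold):
    """Plug-in estimate of 1 - F(threshold): the share of
    observations strictly above the threshold."""
    return sum(1 for v in sample if v > threshold) / len(sample)

losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]  # hypothetical
mean_hat = plug_in_mean(losses)
tail_hat = plug_in_exceedance(losses, 8.0)
```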
Bootstrap Methodology
Efron (1979)
Bickel and Freedman (1981): conditions for consistency, quantile processes, multiple regression, and stratified sampling
Singh (1981): for many statistics bootstrap is asymptotically equivalent to the one-term Edgeworth expansion
Bootstrap SE estimate (Efron)
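$$\widehat{se}_B = \left\{ \frac{1}{B-1} \sum_{b=1}^{B} \left[ \hat\theta^*(b) - \hat\theta^*(\cdot) \right]^2 \right\}^{1/2}, \qquad \hat\theta^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat\theta^*(b)$$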
Bootstrap Standard Error Estimate
Rarely practical to calculate the standard error directly
Instead, approximate it with multiple resamples
Efron’s bootstrap estimate of standard error (BESE), by the Law of Large Numbers, approximates the theoretical standard error in the limit
Should take about 250 resamples
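A minimal sketch of the bootstrap standard error estimate: draw B resamples, evaluate the statistic on each, and take the sample standard deviation of the B replicates (divisor B - 1). The function name `bootstrap_se`, the seed, and the data are assumptions made for illustration; B = 250 follows the rule of thumb above.

```python
import math
import random
import statistics

def bootstrap_se(sample, statistic, B=250, seed=0):
    """Bootstrap estimate of the standard error of `statistic`."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(B):
        star = [rng.choice(sample) for _ in range(n)]   # one resample
        replicates.append(statistic(star))
    center = sum(replicates) / B                        # mean of the replicates
    return math.sqrt(sum((r - center) ** 2 for r in replicates) / (B - 1))

# Hypothetical losses, not taken from the presentation.
losses = [2.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 9.0, 15.0, 17.0, 24.0, 27.0, 43.0]
se_mean = bootstrap_se(losses, statistics.mean)
se_median = bootstrap_se(losses, statistics.median)
```

For the sample mean, the result can be checked against the closed form: the standard deviation of the empirical distribution divided by the square root of n. For statistics such as the median, no comparably simple formula is available, which is where the resampling approach earns its keep.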
The Method of Percentiles
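Let $\hat\theta^*$ be the bootstrap estimate of $\theta$, and let $G^*$ be its distribution function
The bootstrap percentile method uses the inverse images of $\alpha$ and $1-\alpha$ under $G^*$ as the bounds of the confidence interval, which has nominal coverage $1-2\alpha$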
In practice, these bounds are taken from multiple resamples, empirical percentiles
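The percentile method can be sketched as follows, using the empirical percentiles of B bootstrap replicates. The indexing rule for the empirical percentiles is one simple convention among several, and the data, seed, and function name are hypothetical.

```python
import random
import statistics

def percentile_interval(sample, statistic, alpha=0.05, B=1000, seed=0):
    """Bootstrap percentile interval with nominal coverage 1 - 2*alpha."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = sorted(
        statistic([rng.choice(sample) for _ in range(n)]) for _ in range(B)
    )
    k = int(alpha * B)          # number of replicates cut from each tail
    # Empirical alpha and 1 - alpha percentiles of the replicates
    # (one simple indexing convention; others differ by one position).
    return replicates[k], replicates[B - 1 - k]

# Hypothetical losses, not taken from the presentation.
losses = [2.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 9.0, 15.0, 17.0, 24.0, 27.0, 43.0]
lower, upper = percentile_interval(losses, statistics.median)
```

Because the bounds are read directly from the replicates, the interval need not be symmetric around the point estimate, unlike one based on a normal approximation.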
Application to Wind Losses: Quantiles
Hogg & Klugman (1984): data on 40 losses due to wind-related catastrophes in 1977
Standard approach to confidence intervals: normal approximation to the sample quantiles
Hogg and Klugman obtain: (9, 32)
Using the bootstrap method of percentiles, we obtain the interval (8, 27), considerably shorter
Smoothed Bootstrap: Excess Losses
Estimate the probability that a wind loss will exceed a $29,500,000 threshold, i.e., 1 - F(29.5) with losses in millions of dollars. The plug-in estimate is 0.05.
But relative frequency is constant on an interval containing 29.5, and data is rounded off.
Hogg and Klugman use MLE to fit truncated exponential and truncated Pareto distributions.
Figure legend: solid line, exponential; dashed line, Pareto
Smoothed bootstrap using the three-term moving average smoother
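The slides do not spell out how the three-term moving average is applied, so the following is only one plausible reading: compute relative frequencies on an integer grid (the data being rounded), replace each frequency by the average of itself and its two neighbours, and then read tail probabilities or draw resamples from the smoothed distribution. The data, grid size, and function names are hypothetical.

```python
import random

def relative_frequencies(sample, max_value):
    """Relative frequency of each integer value 0..max_value."""
    n = len(sample)
    freq = [0.0] * (max_value + 1)
    for v in sample:
        freq[v] += 1.0 / n
    return freq

def moving_average_3(freq):
    """Three-term moving average with zero padding at both ends,
    renormalized so the smoothed frequencies sum to 1."""
    padded = [0.0] + list(freq) + [0.0]
    smooth = [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
              for i in range(1, len(padded) - 1)]
    total = sum(smooth)
    return [s / total for s in smooth]

def exceedance(freq, threshold):
    """Probability of a value strictly above the threshold."""
    return sum(p for k, p in enumerate(freq) if k > threshold)

losses = [1, 1, 2, 2, 2, 3, 5, 5, 8, 10]          # hypothetical rounded losses
raw = relative_frequencies(losses, max_value=11)  # grid extends past the largest loss
smooth = moving_average_3(raw)

raw_tail = exceedance(raw, 9)        # plug-in estimate, constant between data points
smooth_tail = exceedance(smooth, 9)  # smoothed estimate

# A smoothed bootstrap resample draws from the smoothed frequencies.
rng = random.Random(0)
star = rng.choices(range(len(smooth)), weights=smooth, k=len(losses))
```

The smoothed estimate changes as the threshold moves between observed values, which addresses the flat stretches of the raw relative frequency noted above.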
Clustered Data: Massachusetts Auto Bodily Injury Liability Data
432 closed losses, bodily injury liability in Boston territory for 1995, as of mid-1997
Policy limits capped 16 of the 432 losses, so the data are right-censored
Overwhelming presence of suspected fraud and buildup claims. This causes some numerical values to have unusually high frequencies.
Clustered Data: Massachusetts Auto Bodily Injury Liability Data
No.      Injury Type      Total Amt Paid   Policy Limit
71       Strain/Sprain    3,500            50,000
72       Other            3,500            20,000
73       Strain/Sprain    3,650            20,000
74-76    Strain/Sprain    3,700            20,000
77-80    Strain/Sprain    3,750            20,000
81       Strain/Sprain    3,900            20,000
82-91    Strain/Sprain    4,000            20,000
Figure: approximation to the empirical CDF adjusted for clustering, also zoomed in on the interval (3.5, 5)
Bootstrap Estimates for Loss Elimination Ratio
Standard approach to reinsurance purchase: loss elimination ratio
Can use plug-in bootstrap estimate (empirical loss elimination ratio)
Better: smoothed empirical loss elimination ratio
The result, the smoothed empirical loss elimination ratio (SELER), is shown in the following figure
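For reference, the unsmoothed empirical loss elimination ratio at a retention d can be sketched as below, using the standard definition LER(d) = E[min(X, d)] / E[X] with expectations taken under the empirical distribution. The loss values and the function name are hypothetical, and the smoothing step used for SELER is not reproduced here.

```python
def empirical_ler(sample, d):
    """Empirical loss elimination ratio at retention d:
    sum of min(x, d) over the sample, divided by the sum of x."""
    return sum(min(x, d) for x in sample) / sum(sample)

losses = [1.0, 2.0, 3.0, 4.0]       # hypothetical losses
ler_at_2 = empirical_ler(losses, 2.0)
ler_curve = [empirical_ler(losses, d) for d in (0.0, 1.0, 2.0, 3.0, 4.0)]
```

The curve rises from 0 at d = 0 to 1 once d reaches the largest observed loss; being built from the empirical distribution, it is piecewise linear with kinks at the observed values.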
Policy Limits and Deductibles. Bootstrapping Censored Data
We use the Kaplan-Meier estimator
It can be viewed as a generalization of the usual empirical CDF, adjusted for the censoring of losses
The next figure shows Kaplan-Meier vs. SELER, with the first censoring point at 20
Kaplan-Meier estimator
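$$\hat S(x) = \prod_{i:\, x_i \le x} \left( \frac{n-i}{n-i+1} \right)^{\delta_i}$$
where $x_1 \le \dots \le x_n$ are the ordered observations and $\delta_i = 1$ if $x_i$ is an uncensored loss, $\delta_i = 0$ if it is censored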
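A minimal sketch of the Kaplan-Meier (product-limit) estimator of the survival function S(x) = 1 - F(x) for right-censored losses: each ordered observation i contributes a factor (n - i) / (n - i + 1) only if it is uncensored. The input format, the tie-breaking rule, and the example data are assumptions made for illustration.

```python
def kaplan_meier(observations):
    """Return the Kaplan-Meier estimate of the survival function.

    observations: list of (value, observed) pairs, where observed is
    False when the loss was capped (right-censored) at `value`.
    """
    # Order by value; at tied values, uncensored losses come first.
    ordered = sorted(observations, key=lambda o: (o[0], not o[1]))
    n = len(ordered)

    def S_hat(x):
        s = 1.0
        for i, (value, observed) in enumerate(ordered, start=1):
            if value > x:
                break
            if observed:                    # censored points contribute no factor
                s *= (n - i) / (n - i + 1)
        return s

    return S_hat

# Hypothetical data: the second loss is censored at 2.0.
S_hat = kaplan_meier([(1.0, True), (2.0, False), (3.0, True), (4.0, True)])
S_full = kaplan_meier([(1.0, True), (2.0, True), (3.0, True), (4.0, True)])
```

With no censored observations the product telescopes to one minus the empirical CDF, which is the sense in which the estimator generalizes it.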
Kaplan-Meier vs. SELER
Some Conclusions
These ideas can be extended to all modeled variables
They should be extended
Most interesting for interest rates and capital assets in general
Time series and dependence of variables most challenging
Long tails may be problematic