A Primer on the Exponential Family of Distributions
David Clark & Charles Thayer
American Re-Insurance
GLM Call Paper - 2004

Agenda
• Brief Introduction to GLM
• Overview of the Exponential Family
• Some Specific Distributions
• Suggestions for Insurance Applications

Context for GLM
• Linear Regression: $E[Y] = X \cdot \beta$, with $Y \sim$ Normal
• Generalized Linear Models: $E[Y] = h(X \cdot \beta)$, with $Y \sim$ Exponential Family
• Maximum Likelihood: $E[Y] = h(X, \beta)$, with $Y \sim$ Any Distribution

Advantages over Linear Regression
• Instead of linear combination of covariates, we can use a function of a linear combination of covariates
• Response variable stays in original units
• Great flexibility in variance structure

Transforming the Response versus Transforming the Covariates
Linear Regression: $E[g(y)] = X \cdot \beta$
GLM: $E[y] = g^{-1}(X \cdot \beta)$
Note that if $g(y) = \ln(y)$, then Linear Regression cannot handle any points where $y \le 0$.
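The contrast can be illustrated with a small sketch. The claim counts below are hypothetical, and the model is deliberately restricted to a single binary covariate with a Poisson response and log link, because in that special case the maximum likelihood fit has a closed form (the fitted means equal the group means). It is an illustration of the idea, not a general GLM fitting routine.

```python
import math

# Hypothetical claim counts for two groups (x = 0 and x = 1); note the zeros.
y_group0 = [0, 1, 0, 2, 1]
y_group1 = [3, 0, 4, 2, 1]

def try_log_transform(values):
    """Return ln(y) for each value, or None if any y <= 0 makes ln undefined."""
    try:
        return [math.log(v) for v in values]
    except ValueError:
        return None

def poisson_log_link_two_groups(g0, g1):
    """Closed-form Poisson MLE for E[y] = exp(b0 + b1 * x) with binary x.

    With one indicator covariate the fitted means equal the group means,
    so no iterative fitting is needed for this special case.
    """
    mean0 = sum(g0) / len(g0)
    mean1 = sum(g1) / len(g1)
    b0 = math.log(mean0)
    b1 = math.log(mean1) - math.log(mean0)
    return b0, b1

# Transforming the response fails because some observations are zero.
transformed = try_log_transform(y_group0 + y_group1)

# Transforming the mean instead only requires the group means to be positive.
b0, b1 = poisson_log_link_two_groups(y_group0, y_group1)
fitted_mean_group0 = math.exp(b0)
fitted_mean_group1 = math.exp(b0 + b1)

print("log-transform of response possible:", transformed is not None)
print("GLM fitted means:", fitted_mean_group0, fitted_mean_group1)
```

The log link is applied to the expected value rather than to each observation, so individual zeros cause no difficulty and the fitted means stay in the original units.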

Advantages of this Special Case of Maximum Likelihood
• Pre-programmed in many software packages
• Direct calculation of standard errors of key parameters
• Convenient separation of Mean parameter from “nuisance” parameters

Advantages of this Special Case of Maximum Likelihood
• GLM is useful when theory is immature, but experience gives clues about:
  - How the mean response is affected by external influences (covariates)
  - How variability relates to the mean
  - Independence of observations
  - Skewness/symmetry of the response distribution

General Form of the Exponential Family
$f(y_i; \theta_i) = \exp\left[ d(\theta_i) \cdot e(y_i) + g(\theta_i) + h(y_i) \right]$
Note that $y_i$ can be transformed with any function $e(\cdot)$.

“Natural” Form of the Exponential Family
$f(y_i; \theta_i, \phi) = \exp\left[ \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right]$
Note that $y_i$ is no longer within a function. That is, $e(y_i) = y_i$.
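As a quick check of the natural form, the sketch below evaluates it for the ordinary Poisson distribution and compares the result with the familiar probability function. The choices $\theta = \ln(\mu)$, $b(\theta) = e^{\theta}$, $a(\phi) = 1$ and $c(y) = -\ln(y!)$ are the standard ones for the Poisson with no over-dispersion; the function names and the value of the mean are illustrative assumptions.

```python
import math

def natural_form_density(y, theta, b, a_phi, c):
    """Evaluate exp{[y * theta - b(theta)] / a(phi) + c(y)}."""
    return math.exp((y * theta - b(theta)) / a_phi + c(y))

def poisson_pmf(y, mu):
    """Familiar form of the Poisson probability function."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

mu = 3.5                 # hypothetical mean
theta = math.log(mu)     # natural parameter for the Poisson

natural = [
    natural_form_density(
        y, theta,
        b=math.exp,                          # b(theta) = exp(theta) = mu
        a_phi=1.0,                           # no over-dispersion
        c=lambda v: -math.lgamma(v + 1),     # c(y) = -ln(y!)
    )
    for y in range(8)
]
familiar = [poisson_pmf(y, mu) for y in range(8)]

# Largest absolute difference between the two ways of writing the same pmf.
max_gap = max(abs(n - f) for n, f in zip(natural, familiar))
print(max_gap)
```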

Specific Members of the Exponential Family
• Normal (Gaussian)
• Poisson
• Negative Binomial
• Gamma
• Inverse Gaussian

Some Other Members of the Exponential Family
• Natural Form
  - Binomial
  - Logarithmic
  - Compound Poisson/Gamma (Tweedie)
• General Form [use ln(y) instead of y]
  - Lognormal
  - Single Parameter Pareto

Normal Distribution
Natural Form:
$f(y) = \exp\left\{ \frac{y\mu - \mu^2/2}{\phi} - \frac{1}{2}\left[ \frac{y^2}{\phi} + \ln(2\pi\phi) \right] \right\}$
The dispersion parameter, $\phi$, is replaced with $\sigma^2$ in the more familiar form of the Normal Distribution.

Poisson Distribution
Natural Form:
$\mathrm{Prob}(Y = y) = \exp\left\{ \frac{y\ln(\mu) - \mu}{\phi} - \frac{y}{\phi}\ln(\phi) - \ln\left( (y/\phi)! \right) \right\}$
“Over-dispersed” Poisson allows $\phi \neq 1$.
Variance/Mean ratio = $\phi$
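
Negative Binomial Distribution
Natural Form:
$\mathrm{Prob}(Y = y) = \exp\left\{ \frac{1}{\phi}\left[ y\ln\left(\frac{\mu}{\mu + k}\right) + k\ln\left(\frac{k}{\mu + k}\right) \right] + \ln\binom{(y + k)/\phi - 1}{y/\phi} \right\}$
The parameter k must be selected by the user of the model.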
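
Gamma Distribution
Natural Form (with $\phi = 1/\alpha$):
$f(y) = \exp\left\{ \alpha\left[ -\frac{y}{\mu} - \ln(\mu) \right] + (\alpha - 1)\ln(y) + \alpha\ln(\alpha) - \ln\Gamma(\alpha) \right\}$
Constant Coefficient of Variation (CV):
$CV = \alpha^{-1/2}$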

Inverse Gaussian Distribution
Natural Form:
$f(y) = \exp\left\{ \frac{1}{\phi}\left[ -\frac{y}{2\mu^2} + \frac{1}{\mu} \right] - \frac{1}{2}\left[ \frac{1}{\phi y} + \ln(2\pi\phi y^3) \right] \right\}$

Table of Variance Functions
Distribution         Variance Function
Normal               $\mathrm{Var}(y) = \phi$
Poisson              $\mathrm{Var}(y) = \phi \cdot \mu$
Negative Binomial    $\mathrm{Var}(y) = \phi \cdot \mu + (\phi/k) \cdot \mu^2$
Gamma                $\mathrm{Var}(y) = \phi \cdot \mu^2$
Inverse Gaussian     $\mathrm{Var}(y) = \phi \cdot \mu^3$

The Unit Variance Function
We define the “Unit Variance” function as
$V(\mu) = \mathrm{Var}(y) / a(\phi)$
That is, $\phi = 1$ in the previous table.
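A minimal sketch of these definitions in code, assuming $a(\phi) = \phi$ so that $\mathrm{Var}(y) = \phi \cdot V(\mu)$ as in the table of variance functions. The function names and the numeric inputs are illustrative, not taken from the paper.

```python
import math

def unit_variance(distribution, mu, k=None):
    """V(mu): the variance function with the dispersion phi set to 1."""
    if distribution == "normal":
        return 1.0
    if distribution == "poisson":
        return mu
    if distribution == "negative_binomial":
        return mu + mu ** 2 / k      # k is chosen by the user of the model
    if distribution == "gamma":
        return mu ** 2
    if distribution == "inverse_gaussian":
        return mu ** 3
    raise ValueError(f"unknown distribution: {distribution}")

def variance(distribution, mu, phi, k=None):
    """Var(y) = phi * V(mu), assuming a(phi) = phi."""
    return phi * unit_variance(distribution, mu, k)

def coefficient_of_variation(distribution, mu, phi, k=None):
    """Standard deviation divided by the mean."""
    return math.sqrt(variance(distribution, mu, phi, k)) / mu

# Hypothetical inputs: the Gamma CV does not depend on the mean.
gamma_cv_small = coefficient_of_variation("gamma", 10.0, phi=0.25)
gamma_cv_large = coefficient_of_variation("gamma", 1000.0, phi=0.25)
negbin_variance = variance("negative_binomial", 100.0, phi=2.0, k=25.0)
print(gamma_cv_small, gamma_cv_large, negbin_variance)
```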

Uniqueness Property
The unit variance function $V(\mu)$ uniquely identifies its parent distribution type within the natural exponential family.
$f(y) \Leftrightarrow V(\mu)$

Table of Skewness Coefficients
Distribution         Skewness
Normal               $0$
Poisson              $CV$
Negative Binomial    $[1 + \mu/(\mu + k)] \cdot CV$
Gamma                $2 \cdot CV$
Inverse Gaussian     $3 \cdot CV$
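The table can be written as a small lookup, and one row can be checked numerically. The sketch below computes the Poisson mean, variance and skewness directly from the probability function and compares the skewness with the CV. The function names, the truncation point of the sum, and the mean of 4 are assumptions made for illustration.

```python
import math

def skewness_from_cv(distribution, cv, mu=None, k=None):
    """Skewness coefficient expressed as a multiple of the CV."""
    if distribution == "normal":
        return 0.0
    if distribution == "poisson":
        return cv
    if distribution == "negative_binomial":
        return (1.0 + mu / (mu + k)) * cv
    if distribution == "gamma":
        return 2.0 * cv
    if distribution == "inverse_gaussian":
        return 3.0 * cv
    raise ValueError(f"unknown distribution: {distribution}")

def poisson_cv_and_skewness(mu, upper=200):
    """Moments by direct summation of the Poisson pmf over 0..upper-1."""
    pmf = [math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
           for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    third = sum((y - mean) ** 3 * p for y, p in enumerate(pmf))
    return math.sqrt(var) / mean, third / var ** 1.5

cv, skew = poisson_cv_and_skewness(4.0)
print(cv, skew, skewness_from_cv("poisson", cv))
```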

Graph of Skewness versus CV
[Figure: Coefficient of Skewness (vertical axis, 0 to 6) plotted against Coefficient of Variation (CV) (horizontal axis, 0.0 to 1.2), with curves for the Negative Binomial, LogNormal, Inverse Gaussian, Gamma, Poisson and Normal distributions.]

The Big Question:
What should the variance function look like for insurance applications?

What is the Response Variable?
• Number of Claims
• Frequency (# claims per unit of exposure)
• Severity
• Aggregate Loss Dollars
• Loss Ratio (Aggregate Loss / Premium)
• Loss Rate (Aggregate Loss per unit of exposure)

An Example for Considering Variance Structure
Accident Year   On-Level Premium   Trended Ult. Loss   Loss Ratio
1994                     290,662           1,275,543      438.84%
1995                     391,490              47,490       12.13%
1996                  72,742,613          70,544,925       96.98%
1997                 265,124,454         161,625,762       60.96%
1998                 279,159,910         173,569,322       62.18%
1999                 339,612,341         246,497,223       72.58%
2000                 439,322,504         290,588,625       66.14%
2001                 469,582,172         327,742,407       69.79%
2002                 524,216,086         312,057,030       59.53%
2003                 869,036,055         689,968,152       79.39%
How would you calculate the mean and variance in these loss ratios?
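One way to see why the question matters is to compare two candidate answers on the figures from this example. The sketch below computes a straight (equally weighted) mean and variance of the loss ratios, and then a premium-weighted version. Using premium as the weight is an assumption chosen for illustration, and the variance shown is a simple weighted average of squared deviations with no degrees-of-freedom correction.

```python
# Premium and ultimate loss by accident year, 1994 through 2003.
premium = [290_662, 391_490, 72_742_613, 265_124_454, 279_159_910,
           339_612_341, 439_322_504, 469_582_172, 524_216_086, 869_036_055]
ultimate_loss = [1_275_543, 47_490, 70_544_925, 161_625_762, 173_569_322,
                 246_497_223, 290_588_625, 327_742_407, 312_057_030, 689_968_152]

loss_ratio = [loss / prem for loss, prem in zip(ultimate_loss, premium)]

def mean_and_variance(values, weights):
    """Weighted mean and weighted average squared deviation from it."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var

# Every year counts equally, so the two tiny early years dominate.
straight_mean, straight_var = mean_and_variance(loss_ratio, [1.0] * len(loss_ratio))

# Each year counts in proportion to its premium volume (an assumed weight).
weighted_mean, weighted_var = mean_and_variance(loss_ratio, premium)

print(f"straight: mean={straight_mean:.4f}, variance={straight_var:.4f}")
print(f"weighted: mean={weighted_mean:.4f}, variance={weighted_var:.4f}")
```

The premium-weighted mean is the same as total loss divided by total premium. The two answers differ sharply because the 1994 and 1995 years carry very little volume but extreme loss ratios.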

Defining a Variance Structure
We intuitively know that variance changes with loss volume – but how?
This is the same as asking
“$V(\mu) = ?$”

Defining a Variance Structure
We want CV to decrease with loss size, but not too quickly. GLM provides several approaches:
• Negative Binomial: $\mathrm{Var}(y) = \phi \cdot \mu + (\phi/k) \cdot \mu^2$
• Tweedie: $\mathrm{Var}(y) = \phi \cdot \mu^p$, $1 < p < 2$
• Weighted L-S: $\mathrm{Var}(y) = \phi / w$

The Negative Binomial
The variance function:
$\mathrm{Var}(y) = \phi \cdot \mu + (\phi/k) \cdot \mu^2$
• Random variance: $\phi \cdot \mu$
• Systematic variance: $(\phi/k) \cdot \mu^2$

The “Tweedie” Distribution
             Tweedie                             Neg. Binomial
Frequency    Poisson                             Poisson
Severity     Gamma (exponential when p=1.5)      Logarithmic
Both the Tweedie and the Negative Binomial can be thought of as intermediate cases between the Poisson and Gamma distributions.
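For the Tweedie column, the link between the variance function and the Poisson frequency / Gamma severity components can be sketched as follows. The mapping from the mean, dispersion and power to a Poisson rate and Gamma shape and scale is the usual compound Poisson/Gamma parameterization; the function names and the numeric inputs are assumptions for illustration.

```python
def tweedie_to_compound_poisson_gamma(mu, phi, p):
    """Map Tweedie (mu, phi, p), with 1 < p < 2, to frequency/severity parameters.

    Returns the Poisson claim rate, the Gamma shape and the Gamma scale.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson/Gamma case requires 1 < p < 2")
    claim_rate = mu ** (2.0 - p) / (phi * (2.0 - p))
    gamma_shape = (2.0 - p) / (p - 1.0)
    gamma_scale = phi * (p - 1.0) * mu ** (p - 1.0)
    return claim_rate, gamma_shape, gamma_scale

def compound_mean_and_variance(claim_rate, gamma_shape, gamma_scale):
    """Aggregate moments: mean = rate * E[X], variance = rate * E[X^2]."""
    severity_mean = gamma_shape * gamma_scale
    severity_second_moment = gamma_shape * (gamma_shape + 1.0) * gamma_scale ** 2
    return claim_rate * severity_mean, claim_rate * severity_second_moment

mu, phi, p = 1000.0, 2.0, 1.5      # hypothetical inputs
rate, shape, scale = tweedie_to_compound_poisson_gamma(mu, phi, p)
agg_mean, agg_variance = compound_mean_and_variance(rate, shape, scale)

print("Gamma shape:", shape)       # shape 1 is the exponential distribution
print("aggregate mean:", agg_mean)
print("aggregate variance:", agg_variance, "vs phi * mu**p:", phi * mu ** p)
```

With p = 1.5 the Gamma shape equals 1, which is why the severity is exponential in that case.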

Defining a Variance Structure
Negative Binomial: $\lim_{\mu \to \infty} CV = \sqrt{\phi / k}$
Tweedie: $\lim_{\mu \to \infty} CV = 0$
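The limiting behavior of the CV, a floor of $\sqrt{\phi/k}$ for the Negative Binomial and zero for the Tweedie, can be checked numerically from the two variance functions. In the sketch below the values of $\phi$, $k$ and $p$ are hypothetical, chosen so that $\sqrt{\phi/k} = 0.2$; they are not taken from the paper.

```python
import math

def negative_binomial_cv(mu, phi, k):
    """CV implied by Var(y) = phi * mu + (phi / k) * mu**2."""
    return math.sqrt(phi * mu + (phi / k) * mu ** 2) / mu

def tweedie_cv(mu, phi, p):
    """CV implied by Var(y) = phi * mu**p."""
    return math.sqrt(phi * mu ** p) / mu

phi, k, p = 1.0, 25.0, 1.5     # hypothetical parameters
negbin_floor = math.sqrt(phi / k)

for mu in (100.0, 1_000.0, 10_000.0, 100_000.0):
    print(f"mu={mu:>9,.0f}  NB CV={negative_binomial_cv(mu, phi, k):.4f}  "
          f"Tweedie CV={tweedie_cv(mu, phi, p):.4f}")

print("Negative Binomial limit:", negbin_floor)
```

Both CVs fall as the expected loss grows, but the Negative Binomial CV levels off at a positive value while the Tweedie CV keeps shrinking.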

Defining a Variance Structure
[Figure: Comparison of Negative Binomial and Tweedie CV's. Coefficient of Variation (CV) (vertical axis, 0 to 0.7) plotted against Expected Loss Size on a logarithmic scale (horizontal axis, 100 to 100,000). The Negative Binomial curve is asymptotic to 0.200; the Tweedie (p=1.5) curve is asymptotic to 0.]

Weighted Least-Squares
Use Normal Distribution but set
$a(\phi) = \phi / w_i$
so that the variance is inversely proportional to some external exposure weight $w_i$.
This is equivalent to weighted least-squares: $\text{L-S} = \sum_i (y_i - \mu_i)^2 \cdot w_i$
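The equivalence can be illustrated numerically: with variance $\phi / w_i$, the Normal log-likelihood depends on the fitted means only through the weighted sum of squares, so comparing two candidate fits by log-likelihood gives the same ranking as comparing them by L-S. The data, weights and candidate fits below are hypothetical.

```python
import math

def weighted_least_squares(y, mu, w):
    """L-S = sum of w_i * (y_i - mu_i)**2."""
    return sum(wi * (yi - mi) ** 2 for yi, mi, wi in zip(y, mu, w))

def normal_log_likelihood(y, mu, w, phi):
    """Normal log-likelihood with Var(y_i) = phi / w_i."""
    total = 0.0
    for yi, mi, wi in zip(y, mu, w):
        var_i = phi / wi
        total += -0.5 * math.log(2.0 * math.pi * var_i) - (yi - mi) ** 2 / (2.0 * var_i)
    return total

y = [0.62, 0.73, 0.66, 0.70]       # hypothetical observations
w = [1.0, 2.0, 4.0, 3.0]           # hypothetical exposure weights
phi = 0.5                          # hypothetical dispersion

fit_a = [0.65] * len(y)            # two candidate constant fits
fit_b = [0.70] * len(y)

loglik_gap = normal_log_likelihood(y, fit_a, w, phi) - normal_log_likelihood(y, fit_b, w, phi)
ls_gap = weighted_least_squares(y, fit_a, w) - weighted_least_squares(y, fit_b, w)

# The terms that do not involve mu cancel, leaving loglik_gap = -ls_gap / (2 * phi).
print(loglik_gap, -ls_gap / (2.0 * phi))

# For a constant fit, the L-S minimizer is the weighted mean of y.
weighted_mean = sum(wi * yi for yi, wi in zip(y, w)) / sum(w)
print(weighted_mean)
```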

Conclusion
A model fitted to insurance data should reflect the variance structure of the phenomenon being modeled.
GLM provides a flexible tool for doing this.