A Primer on the Exponential Family of Distributions David Clark & Charles Thayer American Re-Insurance GLM Call Paper - 2004


A Primer on the Exponential Family of Distributions

David Clark & Charles Thayer

American Re-Insurance

GLM Call Paper - 2004

Agenda

• Brief Introduction to GLM

• Overview of the Exponential Family

• Some Specific Distributions

• Suggestions for Insurance Applications

Context for GLM

Linear Regression: $E[Y] = X\beta$, $Y \sim$ Normal

Generalized Linear Models: $E[Y] = h(X\beta)$, $Y \sim$ Exponential Family

Maximum Likelihood: $E[Y] = h(X, \beta)$, $Y \sim$ Any Distribution

Advantages over Linear Regression

• Instead of a linear combination of covariates, we can use a function of a linear combination of covariates

• Response variable stays in original units

• Great flexibility in variance structure

Transforming the Response versus Transforming the Covariates

Linear Regression: $E[g(y)] = X\beta$

GLM: $E[y] = g^{-1}(X\beta)$

Note that if $g(y) = \ln(y)$, then Linear Regression cannot handle any points where $y \le 0$.
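The contrast between transforming the response and transforming the linear predictor can be sketched numerically. The example below is only an illustration under stated assumptions: the data are made up, the variance is assumed to be Poisson, the link is the log link, and `fit_log_link_glm` is a hypothetical helper that uses iteratively reweighted least squares. It is not taken from the paper.

```python
import numpy as np

def fit_log_link_glm(X, y, n_iter=50):
    """Fit E[y] = exp(X @ beta) by iteratively reweighted least squares.

    Assumes a Poisson variance function and that column 0 of X is an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # start from a constant model
    for _ in range(n_iter):
        eta = X @ beta                  # linear predictor
        mu = np.exp(eta)                # mean on the original scale
        z = eta + (y - mu) / mu         # working response
        w = mu                          # working weights (Poisson, log link)
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)
    return beta

# Made-up data containing a zero response
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 1.0, 3.0, 4.0, 9.0, 16.0])
X = np.column_stack([np.ones_like(x), x])

# Transforming the response: ln(0) is not finite, so regression on ln(y) breaks down
with np.errstate(divide="ignore"):
    log_y = np.log(y)
transform_is_usable = bool(np.all(np.isfinite(log_y)))

# Transforming the mean: the GLM keeps y in original units, zeros included
beta = fit_log_link_glm(X, y)
fitted = np.exp(X @ beta)
print("log-transform usable:", transform_is_usable)
print("GLM fitted means:", np.round(fitted, 3))
```

The zero observation stays in the fit because only the mean, not the data point, passes through the logarithm.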

Advantages of this Special Case of Maximum Likelihood

• Pre-programmed in many software packages

• Direct calculation of standard errors of key parameters

• Convenient separation of Mean parameter from “nuisance” parameters

Advantages of this Special Case of Maximum Likelihood

• GLM is useful when theory is immature, but experience gives clues about:

  • How mean response is affected by external influences (covariates)

  • How variability relates to the mean

  • Independence of observations

  • Skewness/symmetry of the response distribution

General Form of the Exponential Family

$$f(y_i; \theta_i) = \exp\left[d(\theta_i)\,e(y_i) + g(\theta_i) + h(y_i)\right]$$

Note that $y_i$ can be transformed with any function $e()$.

“Natural” Form of the Exponential Family

$$f(y_i; \theta_i, \phi) = \exp\left[\frac{y_i\theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi)\right]$$

Note that $y_i$ is no longer within a function. That is, $e(y_i) = y_i$.
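As a supporting sketch (a textbook result, not something stated on the slide), the natural form $f = \exp\left[(y\theta - b(\theta))/a(\phi) + c(y,\phi)\right]$ yields the mean and variance directly. Using the standard likelihood identities $E[\partial \ln f/\partial\theta] = 0$ and $E[\partial^2 \ln f/\partial\theta^2] + E[(\partial \ln f/\partial\theta)^2] = 0$:

$$
\begin{aligned}
\frac{\partial \ln f}{\partial \theta} &= \frac{y - b'(\theta)}{a(\phi)} &&\Rightarrow\quad E[y] = \mu = b'(\theta) \\
\frac{\partial^2 \ln f}{\partial \theta^2} &= -\frac{b''(\theta)}{a(\phi)} &&\Rightarrow\quad \text{Var}(y) = a(\phi)\, b''(\theta)
\end{aligned}
$$

The mean depends only on $\theta$, while the dispersion enters the variance as a multiplier. This is what allows the mean parameter to be separated from the “nuisance” parameters.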

Specific Members of the Exponential Family

• Normal (Gaussian)

• Poisson

• Negative Binomial

• Gamma

• Inverse Gaussian

Some Other Members of the Exponential Family

• Natural Form

  • Binomial

  • Logarithmic

  • Compound Poisson/Gamma (Tweedie)

• General Form [use ln(y) instead of y]

  • Lognormal

  • Single Parameter Pareto

Normal Distribution

Natural Form:

$$f(y) = \exp\left[\frac{y\mu - \mu^2/2}{\phi} - \frac{1}{2}\left(\frac{y^2}{\phi} + \ln(2\pi\phi)\right)\right]$$

The dispersion parameter, $\phi$, is replaced with $\sigma^2$ in the more familiar form of the Normal Distribution.

Poisson Distribution

Natural Form:

$$\text{Prob}(Y = y) = \exp\left[\frac{y\ln(\mu) - \mu}{\phi} - \frac{y}{\phi}\ln(\phi) - \ln\left((y/\phi)!\right)\right]$$

“Over-dispersed” Poisson allows $\phi \neq 1$.

Variance/Mean ratio = $\phi$
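A quick numerical check of the over-dispersed Poisson can make the role of the dispersion parameter concrete. The sketch below assumes the usual construction in which $Y/\phi$ is an ordinary Poisson variable with mean $\mu/\phi$, so that $Y$ takes the values $0, \phi, 2\phi, \dots$. The function names and the values $\mu = 10$, $\phi = 2$ are illustrative assumptions.

```python
import math

def odp_prob(y, mu, phi):
    """Prob(Y = y) for an over-dispersed Poisson, where y is a multiple of phi.

    Assumes Y / phi is Poisson with mean mu / phi.
    """
    n = y / phi
    log_p = (y * math.log(mu) - mu) / phi - n * math.log(phi) - math.lgamma(n + 1.0)
    return math.exp(log_p)

def odp_moments(mu, phi, n_terms=200):
    """Total probability, mean and variance, summing over y = 0, phi, 2*phi, ..."""
    ys = [phi * n for n in range(n_terms)]
    ps = [odp_prob(y, mu, phi) for y in ys]
    total = sum(ps)
    mean = sum(y * p for y, p in zip(ys, ps))
    var = sum((y - mean) ** 2 * p for y, p in zip(ys, ps))
    return total, mean, var

total, mean, var = odp_moments(mu=10.0, phi=2.0)
print("total probability:", round(total, 6))
print("mean:", round(mean, 6), "variance/mean:", round(var / mean, 6))
```

With $\phi = 1$ the same function reduces to the ordinary Poisson probability $\mu^y e^{-\mu}/y!$.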

Negative Binomial Distribution

Natural Form:

$$\text{Prob}(Y = y) = \exp\left[\frac{y\ln\left(\frac{\mu}{\mu+k}\right) + k\ln\left(\frac{k}{\mu+k}\right)}{\phi} + \ln\binom{(y+k)/\phi - 1}{y/\phi}\right]$$

The parameter k must be selected by the user of the model.

Gamma Distribution

Natural Form:

$$f(y) = \exp\left[\alpha\left(-\frac{y}{\mu} - \ln\left(\frac{\mu}{\alpha}\right)\right) + (\alpha - 1)\ln(y) - \ln\Gamma(\alpha)\right]$$

where $\alpha = 1/\phi$.

Constant Coefficient of Variation (CV):

CV = $\alpha^{-1/2}$

Inverse Gaussian Distribution

Natural Form:

$$f(y) = \exp\left[\frac{-\frac{y}{2\mu^2} + \frac{1}{\mu}}{\phi} - \frac{1}{2}\left(\frac{1}{\phi y} + \ln(2\pi\phi y^3)\right)\right]$$

Table of Variance Functions

Distribution         Variance Function
Normal               $\text{Var}(y) = \phi$
Poisson              $\text{Var}(y) = \phi\mu$
Negative Binomial    $\text{Var}(y) = \phi\mu + (\phi/k)\mu^2$
Gamma                $\text{Var}(y) = \phi\mu^2$
Inverse Gaussian     $\text{Var}(y) = \phi\mu^3$

The Unit Variance Function

We define the “Unit Variance” function as

$V(\mu) = \text{Var}(y) / a(\phi)$

That is, $\phi = 1$ in the previous table.
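The variance table and the unit variance function translate directly into code. The sketch below is one possible encoding: the function names and string labels are hypothetical, `mu` is the mean, `phi` the dispersion, and `k` the Negative Binomial parameter chosen by the user.

```python
def variance(dist, mu, phi=1.0, k=None):
    """Var(y) as a function of the mean mu, dispersion phi and (for the NB) k."""
    if dist == "normal":
        return phi
    if dist == "poisson":
        return phi * mu
    if dist == "negative_binomial":
        if k is None:
            raise ValueError("the Negative Binomial needs a user-selected k")
        return phi * mu + (phi / k) * mu ** 2
    if dist == "gamma":
        return phi * mu ** 2
    if dist == "inverse_gaussian":
        return phi * mu ** 3
    raise ValueError(f"unknown distribution: {dist}")

def unit_variance(dist, mu, k=None):
    """V(mu): the variance function with the dispersion set to 1."""
    return variance(dist, mu, phi=1.0, k=k)

for name in ["normal", "poisson", "negative_binomial", "gamma", "inverse_gaussian"]:
    print(name, unit_variance(name, mu=2.0, k=5.0))
```

Keeping `phi` as a separate multiplier mirrors the split between the mean-dependent part of the variance and the dispersion.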

Uniqueness Property

The unit variance function $V(\mu)$ uniquely identifies its parent distribution type within the natural exponential family.

$f(y) \Leftrightarrow V(\mu)$

Table of Skewness Coefficients

Distribution         Skewness
Normal               0
Poisson              CV
Negative Binomial    $[1 + \mu/(\mu+k)] \cdot CV$
Gamma                2·CV
Inverse Gaussian     3·CV
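The skewness table can be spot-checked against well-known closed forms. The sketch below uses a hypothetical helper `skewness` that takes the CV as an input. The comparison values are the textbook skewness of a Gamma with shape $\alpha$ ($2/\sqrt{\alpha}$), of a Poisson with mean $\mu$ ($1/\sqrt{\mu}$), and of a standard Negative Binomial with dispersion 1.

```python
import math

def skewness(dist, cv, mu=None, k=None):
    """Coefficient of skewness expressed as a multiple of the CV."""
    if dist == "normal":
        return 0.0
    if dist == "poisson":
        return cv
    if dist == "negative_binomial":
        return (1.0 + mu / (mu + k)) * cv
    if dist == "gamma":
        return 2.0 * cv
    if dist == "inverse_gaussian":
        return 3.0 * cv
    raise ValueError(f"unknown distribution: {dist}")

# Gamma with shape alpha = 4: CV = alpha**-0.5
alpha = 4.0
gamma_skew = skewness("gamma", cv=alpha ** -0.5)

# Poisson with mean 4: CV = 1 / sqrt(mean)
poisson_skew = skewness("poisson", cv=1.0 / math.sqrt(4.0))

# Negative Binomial with mu = 10, k = 5 (dispersion 1): Var = mu + mu**2 / k
mu, k = 10.0, 5.0
nb_cv = math.sqrt(mu + mu ** 2 / k) / mu
nb_skew = skewness("negative_binomial", cv=nb_cv, mu=mu, k=k)
nb_closed_form = (2.0 * mu + k) / math.sqrt(k * mu * (mu + k))

print(gamma_skew, poisson_skew, round(nb_skew, 6), round(nb_closed_form, 6))
```

For a given CV, the skewness increases from Normal to Poisson to Negative Binomial to Gamma to Inverse Gaussian.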

Graph of Skewness versus CV

[Figure: Coefficient of Skewness (vertical axis, 0 to 6) plotted against Coefficient of Variation (CV) (horizontal axis, 0.0 to 1.2), with curves for the Negative Binomial, LogNormal, Inverse Gaussian, Gamma, Poisson, and Normal distributions.]

The Big Question:

What should the variance function look like for insurance applications?

What is the Response Variable?

• Number of Claims

• Frequency (# claims per unit of exposure)

• Severity

• Aggregate Loss Dollars

• Loss Ratio (Aggregate Loss / Premium)

• Loss Rate (Aggregate Loss per unit of exposure)

An Example for Considering Variance Structure

Accident Year   OnLevel Premium   Trended Ult. Loss   Loss Ratio
1994                    290,662           1,275,543      438.84%
1995                    391,490              47,490       12.13%
1996                 72,742,613          70,544,925       96.98%
1997                265,124,454         161,625,762       60.96%
1998                279,159,910         173,569,322       62.18%
1999                339,612,341         246,497,223       72.58%
2000                439,322,504         290,588,625       66.14%
2001                469,582,172         327,742,407       69.79%
2002                524,216,086         312,057,030       59.53%
2003                869,036,055         689,968,152       79.39%

How would you calculate the mean and variance in these loss ratios?
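One way to explore this question is to compare a straight average of the ten loss ratios with a premium-weighted one. The sketch below uses the figures from the table. Treating premium as the weight is an assumption made for illustration, not a method prescribed by the slides, and the variable names are hypothetical.

```python
premium = [290_662, 391_490, 72_742_613, 265_124_454, 279_159_910,
           339_612_341, 439_322_504, 469_582_172, 524_216_086, 869_036_055]
loss = [1_275_543, 47_490, 70_544_925, 161_625_762, 173_569_322,
        246_497_223, 290_588_625, 327_742_407, 312_057_030, 689_968_152]

ratios = [l / p for l, p in zip(loss, premium)]
n = len(ratios)

# Straight (unweighted) mean and sample variance: every year counts equally
straight_mean = sum(ratios) / n
straight_var = sum((r - straight_mean) ** 2 for r in ratios) / (n - 1)

# Premium-weighted mean and variance: large years dominate
total_premium = sum(premium)
weighted_mean = sum(p * r for p, r in zip(premium, ratios)) / total_premium
weighted_var = sum(p * (r - weighted_mean) ** 2
                   for p, r in zip(premium, ratios)) / total_premium

print(f"straight mean {straight_mean:.4f}, variance {straight_var:.4f}")
print(f"weighted mean {weighted_mean:.4f}, variance {weighted_var:.4f}")
```

The two small early years (1994 and 1995) dominate the unweighted statistics even though they carry almost no premium, which is why the choice of variance structure matters.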

Defining a Variance Structure

We intuitively know that variance changes with loss volume – but how?

This is the same as asking

“$V(\mu)$ = ?”

Defining a Variance Structure

We want CV to decrease with loss size, but not too quickly. GLM provides several approaches:

• Negative Binomial: $\text{Var}(y) = \phi\mu + (\phi/k)\mu^2$

• Tweedie: $\text{Var}(y) = \phi\mu^p$, $1 < p < 2$

• Weighted L-S: $\text{Var}(y) = \phi/w$

The Negative Binomial

The variance function:

$\text{Var}(y) = \phi\mu + (\phi/k)\mu^2$

The first term, $\phi\mu$, is the random variance; the second term, $(\phi/k)\mu^2$, is the systematic variance.

The “Tweedie” Distribution

             Tweedie                            Neg. Binomial
Frequency    Poisson                            Poisson
Severity     Gamma (exponential when p=1.5)     Logarithmic

Both the Tweedie and the Negative Binomial can be thought of as intermediate cases between the Poisson and Gamma distributions.
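The remark that the Tweedie's Gamma severity becomes exponential at p = 1.5 follows from the standard link between the power p and the Gamma shape, $p = (\alpha + 2)/(\alpha + 1)$. The sketch below maps a compound Poisson/Gamma model to Tweedie parameters. The function and argument names (`lam` for the Poisson claim count mean, `alpha` and `scale` for the Gamma severity) are hypothetical, and the numeric inputs are made up.

```python
def tweedie_from_compound(lam, alpha, scale):
    """Map Poisson(lam) frequency and Gamma(alpha, scale) severity to (mu, phi, p)."""
    p = (alpha + 2.0) / (alpha + 1.0)
    mu = lam * alpha * scale                      # mean aggregate loss
    phi = lam ** (1.0 - p) * (alpha * scale) ** (2.0 - p) / (2.0 - p)
    return mu, phi, p

def compound_variance(lam, alpha, scale):
    """Variance of the aggregate loss: lam * E[severity**2]."""
    return lam * alpha * (alpha + 1.0) * scale ** 2

# Exponential severity is a Gamma with shape alpha = 1
mu, phi, p = tweedie_from_compound(lam=3.0, alpha=1.0, scale=500.0)
print("p =", p)
print("phi * mu**p =", phi * mu ** p)
print("direct variance =", compound_variance(3.0, 1.0, 500.0))
```

As the shape grows large, p approaches 1 (the Poisson end). As the shape approaches 0, p approaches 2 (the Gamma end).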

Defining a Variance Structure

Negative Binomial: $\lim_{\mu \to \infty} CV = \sqrt{\phi/k}$

Tweedie: $\lim_{\mu \to \infty} CV = 0$
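Both limits can be checked numerically from the variance functions $\phi\mu + (\phi/k)\mu^2$ (Negative Binomial) and $\phi\mu^p$ (Tweedie). The parameter values below are assumptions chosen only so that the Negative Binomial CV levels off at $\sqrt{\phi/k} = 0.2$. They are not taken from the paper.

```python
import math

def cv_negative_binomial(mu, phi, k):
    """CV implied by Var(y) = phi*mu + (phi/k)*mu**2."""
    return math.sqrt(phi * mu + (phi / k) * mu ** 2) / mu

def cv_tweedie(mu, phi, p):
    """CV implied by Var(y) = phi * mu**p."""
    return math.sqrt(phi * mu ** p) / mu

# Hypothetical parameters: sqrt(phi / k) = sqrt(40 / 1000) = 0.2
phi_nb, k = 40.0, 1000.0
phi_tw, p = 4.4, 1.5

for mu in [100.0, 1_000.0, 10_000.0, 100_000.0]:
    print(f"{mu:>9,.0f}  NB CV = {cv_negative_binomial(mu, phi_nb, k):.3f}"
          f"  Tweedie CV = {cv_tweedie(mu, phi_tw, p):.3f}")
```

The Negative Binomial CV flattens out at a positive floor, while the Tweedie CV keeps falling toward zero as expected loss grows.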

Defining a Variance Structure

[Figure: Comparison of Negative Binomial and Tweedie CV's. Coefficient of Variation (CV) (vertical axis, 0 to 0.7) plotted against expected loss size on a logarithmic scale (horizontal axis, 100 to 100,000). The Negative Binomial curve is asymptotic to 0.200; the Tweedie (p=1.5) curve is asymptotic to 0.]

Weighted Least-Squares

Use Normal Distribution but set

$a(\phi) = \phi / w_i$

so that the variance is inversely proportional to some external exposure weight $w_i$.

This is equivalent to weighted least-squares: $\text{L-S} = \sum (y_i - \mu_i)^2 \cdot w_i$
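The equivalence can be illustrated with a small numerical sketch: the coefficients that minimize the weighted sum of squares also maximize a Normal log-likelihood whose variance is $\phi / w_i$. The data, the weights, and the function names below are hypothetical, and an identity link ($\mu_i = X_i\beta$) is assumed.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimize sum(w * (y - X @ beta)**2) by scaling each row with sqrt(w)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def normal_loglik(beta, X, y, w, phi):
    """Normal log-likelihood with Var(y_i) = phi / w_i and mean X @ beta."""
    var = phi / w
    resid = y - X @ beta
    return float(np.sum(-0.5 * (resid ** 2 / var + np.log(2.0 * np.pi * var))))

# Made-up responses with made-up exposure weights
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
w = np.array([10.0, 200.0, 50.0, 400.0, 150.0])
X = np.column_stack([np.ones_like(x), x])

beta_wls = weighted_least_squares(X, y, w)
# Same answer from the weighted normal equations (X'WX) beta = X'Wy
beta_normal_eq = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))

# Any other coefficient vector gives a lower Normal log-likelihood
ll_best = normal_loglik(beta_wls, X, y, w, phi=1.0)
ll_other = normal_loglik(beta_wls + np.array([0.05, -0.01]), X, y, w, phi=1.0)
print(beta_wls, ll_best, ll_other)
```

The value chosen for `phi` changes the log-likelihood but not the fitted coefficients, because it scales every squared residual by the same factor.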

Conclusion

A model fitted to insurance data should reflect the variance structure of the phenomenon being modeled.

GLM provides a flexible tool for doing this.