COM-Poisson
Duality
Hyper-Poisson
Weighted Hyper-Poisson
Count Regression
Double Double
Using Duality in Count Regression
Michael Anderson
Department of Management Science and Statistics, College of Business
University of Texas at San Antonio
Joint Statistical Meetings – August 2011
Abstract
The COM-Poisson distribution of Conway and Maxwell[3] is used in weighted Poisson regression to model phenomena which are significantly under- or over-dispersed. This flexibility in modeling dispersion derives from the property of duality inherent in the Poisson distribution, in distinction to most other discrete models. Other distributions also possess duality and also provide flexibility in modeling. Two will be discussed and examples of inference will be shown.
1. COM-Poisson
One technique for modeling over- or under-dispersion is to modify the Poisson distribution with a weighting function. Recent work by Shmueli et al.[9][8], Jowaheer and Khan[6], and Guikema and Goffelt[4] has popularized the COM-Poisson distribution. In this distribution, the weighting function appears as an exponent on the factorial in the denominator of the PMF:
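Poisson:

$$P(k) = \frac{1}{C}\,\frac{\theta^k}{k!}, \qquad C = \sum_{i \ge 0} \frac{\theta^i}{i!} = e^{\theta}$$

COM-Poisson:

$$P_\gamma(k) = \frac{1}{C_\gamma}\,\frac{\theta^k}{(k!)^{\gamma}}, \qquad C_\gamma = \sum_{i \ge 0} \frac{\theta^i}{(i!)^{\gamma}}$$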
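The COM-Poisson PMF above can be evaluated numerically by truncating the normalizing sum. The following is a minimal Python sketch, not the author's implementation; the function names and the truncation length `terms` are assumptions made for illustration, and the truncation is only adequate when θ is moderate.

```python
import math

def com_poisson_pmf(theta, gamma, terms=200):
    """Return [P(0), ..., P(terms-1)] for the COM-Poisson PMF, truncating the series."""
    # Work in logs to avoid overflow: log term = k*log(theta) - gamma*log(k!)
    log_terms = [k * math.log(theta) - gamma * math.lgamma(k + 1) for k in range(terms)]
    m = max(log_terms)
    weights = [math.exp(t - m) for t in log_terms]
    total = sum(weights)  # proportional to the normalizing constant C_gamma
    return [w / total for w in weights]

def dispersion_index(pmf):
    """Variance divided by mean for a PMF given as a list indexed by k."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return var / mean

# Print the index of dispersion for a few illustrative weighting exponents.
for gamma in (0.5, 1.0, 2.0):
    print(gamma, round(dispersion_index(com_poisson_pmf(3.0, gamma)), 3))
```

With γ = 1 the sketch reproduces the ordinary Poisson PMF, which gives a quick sanity check on the truncation.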
The weighting exponent γ > 0 allows the distribution to be either under-dispersed (γ > 1) or over-dispersed (0 < γ < 1). The index of dispersion is a function of both the location parameter θ and the shape parameter γ and varies symmetrically around γ = 1.
Sellers and Shmueli[8] have had great success using this distribution in Poisson regression to model processes both over- and under-dispersed.
2. Duality and GHPDs
The key to the flexibility of the COM-Poisson distribution is the property of duality, as described by Kokonendji, Mizere, and Balakrishnan[7].
2.1. Over- and Under-Dispersion
Kokonendji et al. observed that several different weighting schemes can be applied to the Poisson distribution to model either over- or under-dispersion:
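$$P_\gamma(i) = \frac{1}{E[w_i(\gamma)]}\, w_i(\gamma)\, \frac{e^{-\theta}\theta^i}{i!}$$

size-biased          $w_i \propto e^{-\gamma i}$
                     $w_i \propto e^{-\gamma |i - \theta|}$, $\theta \ge 0$
                     $w_i \propto (i + \delta)^{-\gamma}$, $\delta > 0$
factorial biased     $w_i \propto (i!)^{-(\gamma - 1)}$, $\gamma < 2$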
These weighting schemes all share the property of having dual weights such that
$$w_i(+\gamma) \times w_i(-\gamma) = 1$$
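The dual-weight condition can be checked numerically for three of the weight families listed above. This is an illustrative Python sketch only: the values of θ, δ, and γ = ±0.5, the dictionary keys, and the truncation length are assumptions chosen for the example, not values from the talk.

```python
import math

THETA, DELTA = 4.0, 1.0  # illustrative values, not taken from the slides

# Weight families written as functions of the count i and the exponent g (gamma).
WEIGHTS = {
    "exp(-g*i)": lambda i, g: math.exp(-g * i),
    "exp(-g*|i-theta|)": lambda i, g: math.exp(-g * abs(i - THETA)),
    "(i+delta)^(-g)": lambda i, g: (i + DELTA) ** (-g),
}

def weighted_poisson_pmf(weight, g, theta=THETA, terms=150):
    """Weighted Poisson PMF, normalized over a truncated support."""
    raw = [
        weight(i, g) * math.exp(i * math.log(theta) - theta - math.lgamma(i + 1))
        for i in range(terms)
    ]
    total = sum(raw)
    return [r / total for r in raw]

def dispersion_index(pmf):
    """Variance divided by mean for a PMF given as a list indexed by i."""
    mean = sum(i * p for i, p in enumerate(pmf))
    var = sum((i - mean) ** 2 * p for i, p in enumerate(pmf))
    return var / mean

for name, w in WEIGHTS.items():
    # Duality: the weights at +g and -g multiply to one for every count i.
    dual_ok = all(abs(w(i, 0.5) * w(i, -0.5) - 1.0) < 1e-12 for i in range(20))
    d_plus = dispersion_index(weighted_poisson_pmf(w, 0.5))
    d_minus = dispersion_index(weighted_poisson_pmf(w, -0.5))
    print(name, dual_ok, round(d_plus, 3), round(d_minus, 3))
```

Printing the index of dispersion at +γ and −γ side by side makes it easy to see how each weighting scheme moves the distribution relative to the Poisson.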
2.2. GHPDs
The Poisson distribution is one of the simplest examples of a family of power series distributions known as the generalized hypergeometric probability distributions (GHPDs). These distributions have probabilities which are normalized terms from a generalized hypergeometric function:

$${}_pF_q[a_1, a_2, \ldots, a_p;\, b_1, b_2, \ldots, b_q;\, \theta] = \sum_{i \ge 0} \frac{(a_1)_i (a_2)_i \cdots (a_p)_i\, \theta^i}{(b_1)_i (b_2)_i \cdots (b_q)_i\, i!}, \qquad (a)_i = a(a+1)\cdots(a+i-1)$$
and probability generating functions which are the ratio of hypergeometric functions
$$G_x(z) = \frac{{}_pF_q[a_1, a_2, \ldots, a_p;\, b_1, b_2, \ldots, b_q;\, \theta z]}{{}_pF_q[a_1, a_2, \ldots, a_p;\, b_1, b_2, \ldots, b_q;\, \theta]}$$
The GHPD family includes many well-known discrete distributions, including the Poisson, negative binomial, binomial, and hypergeometric distributions[5, p. 88]. These are a few examples, where C denotes the normalizing constant:

name                  PGF
binomial              $C\,{}_1F_0[-n;\, -;\, -\lambda z]$, $\lambda = p/(1-p)$
Poisson               $C\,{}_0F_0[-;\, -;\, \lambda z]$
displaced Poisson     $C\,{}_1F_1[1;\, r+1;\, \lambda z]$
hyper-Poisson         $C\,{}_1F_1[1;\, \lambda;\, \theta z]$
geometric             $C\,{}_1F_0[1;\, -;\, qz]$
negative binomial     $C\,{}_1F_0[k;\, -;\, qz]$
Polya-Eggenberger     $C\,{}_1F_0[a;\, -;\, pz]$, $a = h/\theta$, $\theta = p/(1-p)$
logarithmic           $C\,{}_2F_1[1, 1;\, 2;\, \theta z]$
hypergeometric        $C\,{}_2F_1[-n, -M;\, N-M-n+1;\, z]$
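The idea that GHPD probabilities are normalized terms of a generalized hypergeometric series can be sketched in a few lines of Python. The helper names `pochhammer` and `ghpd_pmf` and the truncation length are hypothetical choices made for this illustration; the sketch assumes non-negative terms and a convergent series, and needs Python 3.8 or later for `math.prod`.

```python
import math

def pochhammer(a, i):
    """Rising factorial (a)_i = a (a+1) ... (a+i-1), with (a)_0 = 1."""
    result = 1.0
    for j in range(i):
        result *= a + j
    return result

def ghpd_pmf(a, b, theta, terms=100):
    """Normalized terms of pFq[a; b; theta], truncated after `terms` terms.

    `a` and `b` are lists of numerator and denominator parameters;
    an empty list plays the role of the '-' placeholder in the table.
    """
    raw = []
    for i in range(terms):
        num = math.prod(pochhammer(x, i) for x in a) * theta ** i
        den = math.prod(pochhammer(x, i) for x in b) * math.factorial(i)
        raw.append(num / den)
    total = sum(raw)  # truncated value of the hypergeometric function
    return [r / total for r in raw]

poisson = ghpd_pmf([], [], 2.0)            # 0F0[-; -; theta]
neg_binomial = ghpd_pmf([3.0], [], 0.4)    # 1F0[k; -; q]
hyper_poisson = ghpd_pmf([1.0], [2.5], 2.0)  # 1F1[1; lambda; theta]
print(poisson[:3], neg_binomial[:3], hyper_poisson[:3])
```

The Poisson and negative binomial rows of the table can be used as checks, since their normalizing constants are known in closed form.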
2.3. Duality Condition
The COM-weighted Poisson distribution can be thought of as a member of a variation of the GHPDs, the polyfactorial GHPDs, where a weighting exponent is applied to the factorial in the denominator of the series terms:

$${}_pF_q^{\gamma}[a_1, a_2, \ldots;\, b_1, b_2, \ldots;\, \theta] = \sum_{i \ge 0} \frac{(a_1)_i (a_2)_i \cdots\, \theta^i}{(b_1)_i (b_2)_i \cdots\, (i!)^{\gamma}}$$
The COM-Poisson is the simplest of these, with
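$$G_x(z) = \frac{{}_0F_0^{\gamma}[-;\, -;\, \theta z]}{{}_0F_0^{\gamma}[-;\, -;\, \theta]}, \qquad P_\gamma(k) = \frac{1}{{}_0F_0^{\gamma}[-;\, -;\, \theta]}\,\frac{\theta^k}{(k!)^{\gamma}}$$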
Most polyfactorial GHPDs do not exhibit duality. A weighting exponent γ > 1 will reduce the dispersion of the distribution, but an exponent γ < 1 will produce a series which does not converge.
However, in the case where the numerator order p does not exceed the denominator order q, the weighted hypergeometric series does converge, and these distributions show duality.
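The convergence condition can be made explicit with the ratio test. As a sketch, write $t_i$ for the i-th term of the polyfactorial series defined above, and assume the series does not terminate (no $a_j$ is a non-positive integer). Using $(a)_{i+1}/(a)_i = a + i$:

$$\frac{t_{i+1}}{t_i} = \theta\,\frac{\prod_{j=1}^{p}(a_j + i)}{\prod_{j=1}^{q}(b_j + i)}\,\frac{1}{(i+1)^{\gamma}} \;\sim\; \theta\, i^{\,p - q - \gamma} \qquad (i \to \infty)$$

The ratio tends to zero, so the series converges for every θ, whenever γ > p − q. When p ≤ q this holds for every γ > 0, which is the case described above. When p > q, any γ < p − q makes the ratio grow without bound, and the series diverges for every θ ≠ 0.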
3. Hyper-Poisson
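Based on the general duality condition, the next likely distribution after the Poisson is Bardwell and Crow's hyper-Poisson distribution[1], with PGF and PMF

$$G_x(z) = \frac{{}_1F_1[1;\, \lambda;\, \theta z]}{{}_1F_1[1;\, \lambda;\, \theta]}, \qquad P(k) = \frac{1}{{}_1F_1[1;\, \lambda;\, \theta]}\,\frac{\theta^k}{(\lambda)_k}$$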
where
$${}_1F_1[1;\, \lambda;\, \theta] = \sum_{j \ge 0} \frac{(1)_j\, \theta^j}{(\lambda)_j\, j!} = \sum_{j \ge 0} \frac{\theta^j}{(\lambda)_j}$$
The λ is a shift parameter which displaces the distribution. (When λ is an integer, this is a shifted Poisson distribution.)
This distribution is interesting in that it has a property very similar to duality. When the shift parameter λ < 1, the distribution is under-dispersed and is called sub-Poisson, while λ > 1 makes the distribution over-dispersed, or super-Poisson. However, this is not true duality, since

$$(1 - \epsilon)_i \ne (1 + \epsilon)_i \qquad \text{and} \qquad (\lambda)_i \ne \left(\frac{1}{\lambda}\right)_i$$
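The sub-Poisson and super-Poisson behaviour can be illustrated numerically. The sketch below is a hedged Python example rather than the author's code; the function names, the value θ = 3, and the truncation length are assumptions for illustration.

```python
import math

def hyper_poisson_pmf(theta, lam, terms=200):
    """Hyper-Poisson PMF, P(k) proportional to theta^k / (lam)_k, truncated."""
    # log (lam)_k = lgamma(lam + k) - lgamma(lam), valid for lam > 0
    log_terms = [
        k * math.log(theta) - (math.lgamma(lam + k) - math.lgamma(lam))
        for k in range(terms)
    ]
    m = max(log_terms)
    weights = [math.exp(t - m) for t in log_terms]
    total = sum(weights)  # proportional to 1F1[1; lam; theta]
    return [w / total for w in weights]

def dispersion_index(pmf):
    """Variance divided by mean for a PMF given as a list indexed by k."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return var / mean

# lam < 1 is the sub-Poisson case, lam = 1 the Poisson, lam > 1 the super-Poisson.
for lam in (0.5, 1.0, 2.0):
    print(lam, round(dispersion_index(hyper_poisson_pmf(3.0, lam)), 3))
```

Setting λ = 1 recovers the ordinary Poisson PMF, so the middle line of output serves as a check on the truncation.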
4. Weighted Hyper-Poisson
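True duality can be introduced to the hyper-Poisson distribution with COM-type weighting, which gives a distribution that can be shifted and compressed or stretched.

$$G_x(z) = \frac{{}_1F_1^{\gamma}[1;\, \lambda;\, \theta z]}{{}_1F_1^{\gamma}[1;\, \lambda;\, \theta]}, \qquad P_\gamma(k) = \frac{1}{{}_1F_1^{\gamma}[1;\, \lambda;\, \theta]}\,\frac{\theta^k}{(\lambda)_k\,(k!)^{\gamma - 1}}$$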
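A short Python sketch shows how the weighted hyper-Poisson PMF contains the earlier distributions as special cases. The function name `pf_hyper_poisson_pmf`, the parameter values, and the truncation length are assumptions made for this example.

```python
import math

def pf_hyper_poisson_pmf(theta, lam, gamma, terms=200):
    """Weighted hyper-Poisson PMF, truncated.

    P(k) is proportional to theta^k / ((lam)_k * (k!)^(gamma - 1)).
    """
    log_terms = [
        k * math.log(theta)
        - (math.lgamma(lam + k) - math.lgamma(lam))  # log (lam)_k
        - (gamma - 1.0) * math.lgamma(k + 1)          # log (k!)^(gamma-1)
        for k in range(terms)
    ]
    m = max(log_terms)
    weights = [math.exp(t - m) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]

# Special cases: lam = 1 gives the COM-Poisson, gamma = 1 the hyper-Poisson,
# and lam = gamma = 1 the ordinary Poisson.
com_like = pf_hyper_poisson_pmf(3.0, 1.0, 2.0)
hyper_like = pf_hyper_poisson_pmf(3.0, 2.5, 1.0)
poisson_like = pf_hyper_poisson_pmf(3.0, 1.0, 1.0)
print(com_like[:3], hyper_like[:3], poisson_like[:3])
```

Because (1)_k = k!, setting λ = 1 collapses the denominator to (k!)^γ, which is why the COM-Poisson appears as a special case.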
The index of dispersion can be adjusted by varying either the shift parameter (λ) or the weight parameter (γ).
Is this additional flexibility useful?
5. Count Regression
How do the hyper-Poisson and polyfactorial hyper-Poisson compare to the COM-Poisson in fitting regression models where there is over- or under-dispersion?
Sellers and Shmueli’s paper[8] provides an extensive comparison of COM-Poisson regression to (ordinary) Poisson regression and negative binomial regression. I’ve extended their comparisons to the hyper-Poisson distributions, unweighted and weighted.

COM-Poisson:

$$P(y \mid x) = \frac{1}{{}_0F_0^{\gamma}[-;\, -;\, \theta(x)]}\,\frac{\theta(x)^y}{(1)_y^{\gamma}}$$

hyper-Poisson:

$$P(y \mid x) = \frac{1}{{}_1F_1[1;\, \lambda;\, \theta(x)]}\,\frac{\theta(x)^y}{(\lambda)_y}$$

pf hyper-Poisson:

$$P(y \mid x) = \frac{1}{{}_1F_1^{\gamma}[1;\, \lambda;\, \theta(x)]}\,\frac{\theta(x)^y}{(\lambda)_y\,(1)_y^{\gamma - 1}}$$

with the log link

$$\theta(x) = e^{\beta_0 + \beta_1 x}$$
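All three regression models share one log-likelihood if the polyfactorial hyper-Poisson is treated as the general case. The Python sketch below is an assumption-laden illustration, not the Mathematica code used for the talk: the function names, the truncation length, and the toy data are all made up for the example, and the truncated normalizer is only adequate when θ(x) is moderate.

```python
import math

def log_term(y, log_theta, lam, gamma):
    """log of theta^y / ((lam)_y * (y!)^(gamma - 1))."""
    return (
        y * log_theta
        - (math.lgamma(lam + y) - math.lgamma(lam))
        - (gamma - 1.0) * math.lgamma(y + 1)
    )

def log_normalizer(log_theta, lam, gamma, terms=300):
    """log of the weighted 1F1 series at theta, via a truncated log-sum-exp."""
    logs = [log_term(k, log_theta, lam, gamma) for k in range(terms)]
    m = max(logs)
    return m + math.log(sum(math.exp(t - m) for t in logs))

def log_likelihood(beta0, beta1, lam, gamma, xs, ys):
    """Log-likelihood with log theta(x) = beta0 + beta1 * x.

    lam = 1 gives COM-Poisson regression, gamma = 1 gives hyper-Poisson
    regression, and lam = gamma = 1 gives ordinary Poisson regression.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        log_theta = beta0 + beta1 * x
        total += log_term(y, log_theta, lam, gamma) - log_normalizer(log_theta, lam, gamma)
    return total

# Hypothetical toy data, for illustration only (not one of the data sets in the talk).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1, 2, 4, 3]
print(log_likelihood(0.5, 0.3, 1.0, 1.0, xs, ys))
```

A function like this can be handed to any general-purpose optimizer; maximizing over (β0, β1, λ, γ) with λ or γ fixed at 1 gives the restricted models.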
5.1. Data Sets
All three models were used to fit two data sets, one under-dispersed, the other over-dispersed. These data sets come from Sellers and Shmueli, to allow direct comparison with existing results.
Airfreight Breakage The number of broken ampules (out of 1000) in 10 air shipments. The predictor variable is the number of times the ampule carton is transferred between aircraft. This data set showed under-dispersion in Poisson regression.
Textile Faults The number of yarn breaks during each of 32 textile process runs. The predictor variable is the log of eachtextile roll. This data set showed over-dispersion in Poisson regression.
5.2. Results
All three models were fitted with maximum likelihood estimates using the NMaximize[Method→"NelderMead"] function in Mathematica. Results for the COM-Poisson fits were nearly identical to those reported by Sellers and Shmueli.

                      Air Freight                    Fabric Rolls
model                 MSE     CAIC      time         MSE      CAIC      time
COM-Poisson           1.9     47.198    4.727        21.80    191.03    8.783
hyper-Poisson         2.6     55.716    3.198        21.99    195.35    40.903
pf hyper-Poisson      1.9     50.341    24.960       20.83    192.52    65.579
Sellers and Shmueli are generous in their definition of MSE (no penalty for the number of parameters), but do use a highly penalized AIC:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} e_i^2, \qquad \mathrm{CAIC} = -2\ln(L) + k(1 + \ln n)$$
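The two criteria are straightforward to compute once a model has been fitted. A minimal Python sketch follows; the function names and the example numbers are illustrative assumptions, with `k` the number of fitted parameters and `n` the number of observations.

```python
import math

def mse(observed, fitted):
    """Mean squared error with no penalty for the number of parameters."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)

def caic(log_lik, k, n):
    """CAIC = -2 ln(L) + k (1 + ln n)."""
    return -2.0 * log_lik + k * (1.0 + math.log(n))

# Hypothetical numbers, for illustration only.
print(mse([1, 2, 3], [1, 2, 5]))
print(caic(-20.0, 3, 10))
```

Because the penalty grows with ln n, each extra parameter costs more under CAIC than under the ordinary AIC once n exceeds e, about 2.7 observations.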
The four models (Poisson, COM-Poisson, hyper-Poisson, and polyfactorial hyper-Poisson) form a network of nested models which can be compared with likelihood ratio tests. For the air freight data, the COM-Poisson model is clearly the most parsimonious.
The small p-value for the test comparing the hyper-Poisson and pf hyper-Poisson models confirms that the γ-weighting is what improves the model fit.
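Each link in the network of nested models differs by a single parameter (λ or γ), so each likelihood ratio test has one degree of freedom. The Python sketch below assumes the usual asymptotic chi-square reference distribution; the function name and the example log-likelihoods are hypothetical and are not results from the talk.

```python
import math

def lrt_one_df(loglik_restricted, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For one degree of freedom the chi-square
    upper tail is P(X > s) = erfc(sqrt(s / 2)).
    """
    stat = 2.0 * (loglik_full - loglik_restricted)
    if stat < 0.0:
        # The full model's maximum cannot lie below the restricted model's.
        raise ValueError("negative LRT statistic: check optimizer convergence")
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods, for illustration only.
stat, p_value = lrt_one_df(-25.0, -22.0)
print(round(stat, 3), round(p_value, 4))
```

The sketch rejects a negative statistic because, for properly nested models that are both fitted to their maxima, the larger model can never have the lower likelihood; a negative value therefore suggests that the numerical optimizer stopped short for one of the fits.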
The comparison for the textile faults data is similar, with the somewhat arresting indication that the addition of the λ parameter to the COM-Poisson model, while reducing the MSE, actually has a negative LRT statistic.
5.3. Conclusions
• The hyper-Poisson and polyfactorial hyper-Poisson distributions can model over- and under-dispersion in count regression.
• In simple cases, they do not show any marked advantage over COM-Poisson regression.
“...it doesn’t take much to see that the problems of three little people [models] don’t amount to a hill of beans in this crazy world.” – Rick, Casablanca
References
[1] George E. Bardwell and Edwin L. Crow, A two-parameter family of hyper-Poisson distributions, Journal of the American Statistical Association 59 (1964), no. 305, 133–141.
[2] A. Colin Cameron and Pravin K. Trivedi, Regression analysis of count data, Econometric Society Monographs, Cambridge University Press, Cambridge, United Kingdom, 1998.
[3] R. W. Conway and W. L. Maxwell, A queueing model with state dependent service rates, Journal of Industrial Engineering 12 (1962), 132–136.
[4] Seth D. Guikema and Jeremy P. Goffelt, A flexible count data regression model for risk analysis, Risk Analysis 28 (2008), no. 1, 213–223.
[5] Norman L. Johnson, Samuel Kotz, and Adrienne W. Kemp, Univariate discrete distributions, 2nd ed., John Wiley and Sons, 1993.
[6] Vanda Jowaheer and Muushad Khan, Estimating regression effects in COM Poisson generalized linear model, World Academy of Science, Engineering and Technology 53 (2009), 1046–1050.
[7] Celestin C. Kokonendji, Dominique Mizere, and N. Balakrishnan, Connections of the Poisson weight function to overdispersion and underdispersion, Journal of Statistical Planning and Inference 138 (2008), no. 5, 1287–1296.
[8] Kimberly F. Sellers and Galit Shmueli, A flexible regression model for count data, Tech. Report Research Paper No. RHS 06-060, Robert H. Smith School, December 4, 2008.
[9] Galit Shmueli, Thomas P. Minka, Joseph B. Kadane, Sharad Borle, and Peter Boatwright, A useful distribution for fitting discrete data: Revival of the Conway-Maxwell-Poisson distribution, Journal of the Royal Statistical Society, Series C: Applied Statistics 54 (2005), no. 1, 127–142.