Poisson & Negative Binomial Regression
“Now I've got heartaches by the number, Troubles by the score,
Every day you love me less, Each day I love you more” (Ray Price)
Count Variables
Number of times a particular event occurs to each case, usually within a given:
  Time period (e.g., number of hospital visits per year)
  Population size (e.g., number of registered sex offenders per 100,000 population), or
  Geographical area (e.g., number of divorces per county or state)
Whole numbers that can range from 0 through +∞
Count DVs
Number of hospital visits, outpatient visits, services used, divorces, arrests, criminal offenses, symptoms, placements, children fostered, children adopted
Overview
Poisson regression
  Basic model for count DVs
Negative binomial regression
  Alternative to Poisson regression
    • Less restrictive assumptions, and so greater generality
Single (Dichotomous) IV Example
DV = number of foster children adopted
IV = marital status, 0 = unmarried, 1 = married
N = 285 foster mothers
Is there a difference in the number of foster children adopted by unmarried and married foster mothers?
Distribution of Count DVs
Typically skewed positively with large percentage of 0 values
Number of Foster Children Adopted
Descriptive Statistics
Table 5.1
Why is a t-test for independent groups not appropriate here?
Strength & Direction of Relationships
Being married increased the mean number of children adopted by a factor of 1.47 (47%)
1.112 / .754 = 1.47
100(1.47 – 1.00) = 47%
Question & Answer
Is there a difference in the number of foster children adopted by unmarried and married foster mothers?
Yes. The mean number of children adopted by unmarried mothers is .75 and by married mothers 1.11. So, being married increased the mean number of children adopted by a factor of 1.47 (47%).
But, analysis incorrect because…
Exposure
Opportunity for event to occur
  Length of time, population size, geographical area, or other domain of interest
Number of years fostering varied across mothers, and so opportunity to adopt foster children varied
  Unmarried mothers, M = 8.803
  Married mothers, M = 7.254
Rate
Count per unit of…
  Time (e.g., number of children adopted per year)
  Population (e.g., number of registered sex offenders per 100,000)
  Geographical area (e.g., number of children below the poverty line per state)
Rate (cont’d)
λ = μ / E
  λ (lambda), mean population rate
    • Sometimes referred to as the incidence rate
  μ (mu), mean population count
    • Sometimes referred to as incidence
  E, exposure
Rate (cont’d)
Example
Rate (unmarried) = .754 / 8.803 = .086
  • .086 children adopted yearly (rate)
Rate (married) = 1.112 / 7.254 = .153
  • .153 children adopted yearly (rate)
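The rate arithmetic in this example can be reproduced in a few lines of Python. This is a minimal sketch: the function name `rate` is our own, and the inputs are the group means reported on the slides (mean children adopted and mean years fostering).

```python
def rate(mean_count, mean_exposure):
    """Rate = mean count divided by mean exposure."""
    if mean_exposure <= 0:
        raise ValueError("exposure must be positive")
    return mean_count / mean_exposure

# Group means taken from the slides.
rate_unmarried = rate(0.754, 8.803)
rate_married = rate(1.112, 7.254)

print(f"unmarried: {rate_unmarried:.3f} children adopted per year")
print(f"married:   {rate_married:.3f} children adopted per year")
```

Dividing by exposure reverses the impression given by the raw means: unmarried mothers fostered longer on average, so their lower count understates how much lower their yearly rate is.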
Incidence Rate Ratio (IRR)
IRR = λMarried / λUnmarried
Quantifies the direction and strength of relationship between IVs and DV
Being married increased the yearly adoption rate by a factor of 1.78 (78%)
  • .153 / .086 = 1.78
  • 100(1.78 – 1.00) = 78%
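As a sketch of the IRR and percentage-change arithmetic, the snippet below computes both from the two yearly rates. The helper names `irr` and `percent_change` are illustrative. It also shows that dividing the unrounded rates gives about 1.79 rather than 1.78, so the small difference comes only from rounding the rates to three decimals first.

```python
def irr(rate_numerator, rate_denominator):
    """Incidence rate ratio: numerator-group rate over denominator-group rate."""
    return rate_numerator / rate_denominator

def percent_change(ratio):
    """Percentage change implied by a ratio, i.e. 100(ratio - 1)."""
    return 100 * (ratio - 1)

# Using the rates as rounded on the slide.
irr_rounded = irr(0.153, 0.086)

# Using the unrounded rates (mean count / mean exposure).
irr_unrounded = irr(1.112 / 7.254, 0.754 / 8.803)

print(f"rounded rates:   IRR = {irr_rounded:.2f}, {percent_change(irr_rounded):.0f}%")
print(f"unrounded rates: IRR = {irr_unrounded:.2f}, {percent_change(irr_unrounded):.0f}%")
```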
Incidence Rate Ratio (IRR) (cont’d)
IRR = 1
  Numerator group and denominator group have same incidence rate
IRR > 1
  Numerator group has a higher incidence rate than denominator group
IRR < 1
  Numerator group has a lower incidence rate than the denominator group
Potential range from 0 through +∞
Comparing IRR > 1 & IRR < 1
Compute reciprocal of one of the IRRs
  e.g., IRR of 2.00 and an IRR of .50
  Reciprocal of .50 is 2.00 (1 / .50 = 2.00)
  IRRs are equal in size (but not in direction of the relationship)
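The reciprocal rule can be sketched as a small helper that puts every IRR on the "greater than 1" side so that sizes can be compared directly. The function name `comparable_size` is hypothetical, chosen only for this illustration.

```python
def comparable_size(irr_value):
    """Return the IRR itself if it is >= 1, otherwise its reciprocal."""
    if irr_value <= 0:
        raise ValueError("an IRR must be greater than 0")
    return irr_value if irr_value >= 1 else 1 / irr_value

size_a = comparable_size(2.00)  # already above 1
size_b = comparable_size(0.50)  # reciprocal: 1 / .50

print(size_a, size_b)  # equal in size, opposite in direction
```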
Question & Answer
Is there a difference in the number of foster children adopted by unmarried and married foster mothers?
Yes. The yearly adoption rate for unmarried mothers is .09 and for married mothers .15. So, being married increased the yearly adoption rate by a factor of 1.78 (78%).
Poisson Regression
Single (Dichotomous) IV Example (ignoring exposure)
DV = number of foster children adopted
IV = marital status, 0 = unmarried, 1 = married
N = 285 foster mothers
Is there a difference in the number of foster children adopted by unmarried and married foster mothers?
Statistical Significance
Tables 5.2, 5.3
  Relationship between marital status and children adopted is statistically significant (Wald χ² = 5.846, p = .016)
H0: β = 0, β ≥ 0, β ≤ 0, same as
H0: IRR = 1, IRR ≥ 1, IRR ≤ 1
Likelihood ratio χ² better than Wald
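The reported p-value can be checked from the Wald statistic alone. The sketch below assumes the Wald test for a single coefficient has 1 degree of freedom (an assumption on our part, although it is the usual case), and uses the identity that the upper tail of a 1-df chi-square at x equals erfc(sqrt(x / 2)). The function name is illustrative.

```python
import math

def chi2_p_value_1df(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    if chi_square < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2))

# Wald chi-square for marital status, as reported on the slide.
p = chi2_p_value_1df(5.846)
print(f"p = {p:.3f}")
```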
Slope
B = slope
  Positive slope, positive relationship
    • IRR > 1
  Negative slope, negative relationship
    • IRR < 1
  0 slope, no linear relationship
    • IRR = 1
Slope (cont’d)
B = .388
Positive relationship between marital status and children adopted
  Married mothers adopt more children
IRR & Percentage Change
Exp(B) = IRR = 1.474
% change = 100(1.474 - 1) = 47%
Married mothers adopt more children
  Because exposure is ignored here, being married increased the mean number of children adopted by a factor of 1.47 (47%)
Poisson Model
ln(μ) = α + β1X1 + β2X2 + … + βkXk, or ln(μ) = η
ln(μ), log of mean count (“log link”)
  e.g., log of mean number of children adopted
η, abbreviation for linear predictor (right hand side of this equation)
k = number of independent variables
Inverse (reverse) Link
μ = e^η
  μ is the mean count
    • e.g., mean number of children adopted
ln(μ) to μ
ln(mean) = -.282 + (.388)(XMarried)
Unmarried mothers
  ln(mean) = -.282 + (.388)(0) = -.282
  mean = e^(-.282) = .754
  mean = .75 children adopted
Married mothers
  ln(mean) = -.282 + (.388)(1) = .106
  mean = e^(.106) = 1.112
  mean = 1.11 children adopted
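The inverse link can be traced in code. The sketch below plugs the reported intercept and slope into the linear predictor and exponentiates it. The constant and function names are our own; the coefficients are the ones on the slides.

```python
import math

ALPHA = -0.282     # intercept (log of the mean count for unmarried mothers)
B_MARRIED = 0.388  # slope for marital status (0 = unmarried, 1 = married)

def mean_count(married):
    """Apply the inverse of the log link: mean = exp(linear predictor)."""
    linear_predictor = ALPHA + B_MARRIED * married
    return math.exp(linear_predictor)

mean_unmarried = mean_count(0)
mean_married = mean_count(1)
ratio_of_means = mean_married / mean_unmarried  # equals exp(B_MARRIED)

print(f"unmarried: {mean_unmarried:.3f}")
print(f"married:   {mean_married:.3f}")
print(f"Exp(B):    {ratio_of_means:.3f}")
```

The ratio of the two predicted means is exactly Exp(B), which is why the exponentiated slope can be read directly as a multiplicative effect.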
Question & Answer
Is there a difference in the number of foster children adopted by unmarried and married foster mothers?
Yes. The mean number of children adopted by unmarried mothers is .75 and by married mothers 1.11. So, being married increased the mean number of children adopted by a factor of 1.47 (47%).
But, analysis incorrect because…
Single (Dichotomous) IV Example (with exposure)
Use SPSS to create an “offset” variable
  Natural log of the exposure variable
    • Exposure variable must be > 0
  compute lnYearsFostered = ln(YearsFostered).
Enter offset variable into the regression analysis
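To illustrate what the offset does, here is a minimal Python sketch using hypothetical toy data (not the foster-mother data). It relies on a known property of Poisson regression with a log-exposure offset and a single dichotomous IV: the maximum-likelihood rate for each group equals total events divided by total exposure, so no iterative fitting is needed for this special case. All names are illustrative.

```python
import math

# Hypothetical toy data: (married, years_fostered, children_adopted).
mothers = [
    (0, 10.0, 1), (0, 6.0, 0), (0, 4.0, 1),
    (1, 5.0, 1), (1, 8.0, 2), (1, 7.0, 0),
]

# The offset is the natural log of exposure, so exposure must be > 0.
offsets = [math.log(years) for _, years, _ in mothers]

def group_rate(group):
    """Total events / total exposure: the Poisson ML rate for one group."""
    events = sum(y for m, _, y in mothers if m == group)
    exposure = sum(e for m, e, _ in mothers if m == group)
    return events / exposure

intercept = math.log(group_rate(0))              # ln(rate) for unmarried
slope = math.log(group_rate(1) / group_rate(0))  # ln(IRR) for married

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"IRR = {math.exp(slope):.2f}")
```

With more IVs or a quantitative IV there is no such shortcut, and the model has to be estimated with statistical software.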
Statistical Significance
Tables 5.4, 5.5
  Relationship between marital status and yearly adoption rate is statistically significant (Wald χ² = 13.131, p < .001)
IRR & Percentage Change
Exp(B) = IRR = 1.789
% change = 100(1.789 - 1) = 79%
Married mothers adopt more children per year
Being married increased the yearly adoption rate by a factor of 1.79 (79%)
ln(λ) to λ
ln(rate) = -2.457 + (.582)(XMarried)
Unmarried mothers
  ln(rate) = -2.457 + (.582)(0) = -2.457
  rate = e^(-2.457) = .086
  .09 children adopted yearly (rate)
Married mothers
  ln(rate) = -2.457 + (.582)(1) = -1.875
  rate = e^(-1.875) = .153
  .15 children adopted yearly (rate)
Roadmap to Computations
Log of rates
  ln(λ) = η
Rates
  λ = e^η
IRR
  λ(1) / λ(0)
% change
  100(IRR - 1)
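The four steps of the roadmap can be chained in one function. This is a sketch using the intercept and slope reported for the marital-status model with the offset; the function name `roadmap` and the dictionary keys are our own.

```python
import math

def roadmap(alpha, b, x_low=0, x_high=1):
    """Log of rates -> rates -> IRR -> percentage change."""
    log_rate_low = alpha + b * x_low
    log_rate_high = alpha + b * x_high
    rate_low = math.exp(log_rate_low)
    rate_high = math.exp(log_rate_high)
    irr = rate_high / rate_low
    return {
        "log_rates": (log_rate_low, log_rate_high),
        "rates": (rate_low, rate_high),
        "irr": irr,
        "percent_change": 100 * (irr - 1),
    }

# Coefficients as reported on the slides (rounded to three decimals).
result = roadmap(alpha=-2.457, b=0.582)

print(f"rates: {result['rates'][0]:.3f}, {result['rates'][1]:.3f}")
print(f"IRR = {result['irr']:.2f}, change = {result['percent_change']:.0f}%")
```

Because the slope is entered as rounded (.582), the third decimal of the IRR can differ slightly from the value SPSS prints; to two decimals the results agree.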
Question & Answer
Is there a difference in the number of foster children adopted by unmarried and married foster mothers?
Yes. The yearly adoption rate for unmarried mothers is .09 and for married mothers .15. So, being married increased the yearly adoption rate by a factor of 1.79 (79%).
Single (Quantitative) IV Example
DV = number of foster children adopted
IV = Perceived responsibility for parenting (scale scores transformed to z-scores)
Offset variable = log of years fostered
N = 285 foster mothers
Do foster mothers who feel a greater responsibility to parent foster children adopt more foster children?
Statistical Significance
Tables 5.6, 5.7
  Relationship between parenting responsibility and yearly adoption rate is statistically significant (Wald χ² = 10.045, p = .002)
IRR & Percentage Change
Exp(B) = IRR = 1.202
% change = 100(1.202 - 1) = 20%
Mothers with greater parenting responsibility adopt more children per year
For every one-standard deviation increase in parenting responsibility the yearly adoption rate increases by a factor of 1.20 (20%)
ln(λ) to λ
ln(rate) = -2.008 + (.184)(XzParentRole)
e.g., mean parenting responsibility (z = 0):
  ln(rate) = -2.008 + (.184)(0) = -2.008
  rate = e^(-2.008) = .13
  .13 children adopted yearly (rate)
Figure (zParentRole.xls): Effect of Standardized Parenting Responsibility on Adoption Rate
x-axis: Standardized Parenting Responsibility; y-axis: Adoption Rate
Standardized Parenting Responsibility   -3    -2    -1    0     1     2     3
Rate                                    0.08  0.09  0.11  0.13  0.16  0.19  0.23
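The plotted rates follow directly from the estimated equation. As a sketch, the snippet below evaluates the rate at each z-score from -3 to 3 using the reported coefficients; the names are illustrative.

```python
import math

ALPHA = -2.008  # intercept: log of the yearly rate at z = 0
B_Z = 0.184     # slope per one-standard-deviation increase

def yearly_rate(z):
    """Predicted yearly adoption rate at a given standardized score."""
    return math.exp(ALPHA + B_Z * z)

# Rates at z = -3, -2, ..., 3, rounded to two decimals as in the figure.
rates = {z: round(yearly_rate(z), 2) for z in range(-3, 4)}

for z, r in rates.items():
    print(f"z = {z:+d}: {r:.2f} children adopted per year")
```

Each one-unit step in z multiplies the rate by the same factor, Exp(B), so the curve bends upward rather than rising in a straight line.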
Question & Answer
Do foster mothers who feel a greater responsibility to parent foster children adopt more foster children?
Yes. For every one-standard deviation increase in parenting responsibility the yearly adoption rate increases by a factor of 1.20 (20%). The yearly adoption rate is .09 for mothers two standard deviations below the mean, .13 for mothers at the mean, and .19 for mothers two standard deviations above the mean.
Multiple IV Example
DV = number of foster children adopted
IV = Perceived responsibility for parenting (scale scores transformed to z-scores)
IV = marital status, 0 = unmarried, 1 = married
Offset variable = log of years fostered
N = 285 foster mothers
Do foster mothers who take more responsibility for parenting adopt more foster children per year, controlling for marital status?
Statistical Significance
Table 5.8
  Relationship between set of IVs and yearly adoption rate is statistically significant (χ² = 27.792, p < .001)
H0: β1 = β2 = βk = 0, same as
H0: IRR1 = IRR2 = IRRk = 1
Statistical Significance
Table 5.9
  Relationship between parenting responsibility and yearly adoption rate is statistically significant, controlling for marital status (χ² = 11.853, p = .001)
  Relationship between marital status and yearly adoption rate is statistically significant, controlling for parenting responsibility (χ² = 16.520, p < .001)
Statistical Significance
Table 5.10
  Relationship between parenting responsibility and yearly adoption rate is statistically significant, controlling for marital status (Wald χ² = 11.576, p = .001)
  Relationship between marital status and yearly adoption rate is statistically significant, controlling for parenting responsibility (Wald χ² = 14.433, p < .001)
IRR & Percentage Change: Parenting Responsibility
Exp(B) = IRR = 1.219
% change = 100(1.219 - 1) = 22%
Mothers with greater parenting responsibility adopt more children per year, controlling for marital status
For every one-standard deviation increase in parenting responsibility the yearly adoption rate increases by a factor of 1.22 (22%), controlling for marital status
IRR & Percentage Change: Marital Status
Exp(B) = IRR = 1.842
% change = 100(1.842 - 1) = 84%
Married mothers adopt more children per year, controlling for parenting responsibility
Being married increased the yearly adoption rate by a factor of 1.84 (84%), controlling for parenting responsibility
ln(λ) to λ
ln(rate) = -2.498 + (.198)(XzParentRole) + (.611)(XMarried)
e.g., mean parenting responsibility (z = 0) and unmarried mothers:
  ln(rate) = -2.498 + (.198)(0) + (.611)(0) = -2.498
  rate = e^(-2.498) = .08
  .08 children adopted yearly (rate)
Figure (Married & zParentRole.xls): Effect of Standardized Parenting Responsibility and Marital Status on Adoption Rate
x-axis: Standardized Parenting Responsibility; y-axis: Adoption Rate
Standardized Parenting Responsibility   -3    -2    -1    0     1     2     3
Unmarried                               0.05  0.06  0.07  0.08  0.10  0.12  0.15
Married                                 0.08  0.10  0.12  0.15  0.18  0.23  0.27
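With two IVs the same calculation is repeated for each combination of values. The sketch below rebuilds both rows of the figure from the reported Poisson coefficients; the names are our own.

```python
import math

ALPHA = -2.498     # intercept: unmarried mothers at z = 0
B_Z = 0.198        # slope for standardized parenting responsibility
B_MARRIED = 0.611  # slope for marital status (0 = unmarried, 1 = married)

def yearly_rate(z, married):
    """Predicted yearly adoption rate for a given z-score and marital status."""
    return math.exp(ALPHA + B_Z * z + B_MARRIED * married)

z_values = range(-3, 4)
unmarried = [round(yearly_rate(z, 0), 2) for z in z_values]
married = [round(yearly_rate(z, 1), 2) for z in z_values]

print("z:        ", list(z_values))
print("unmarried:", unmarried)
print("married:  ", married)
```

At every z-score the married rate is the unmarried rate multiplied by Exp(.611), about 1.84, which is what "controlling for parenting responsibility" means in this additive-on-the-log-scale model.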
Question & Answer
Do foster mothers who take more responsibility for parenting adopt more foster children per year, controlling for marital status?
Yes. For every one-standard deviation increase in parenting responsibility the yearly adoption rate increases by a factor of 1.22 (22%), controlling for marital status.
Question & Answer (cont’d)
Do foster mothers who take more responsibility for parenting adopt more foster children per year, controlling for marital status?
For unmarried mothers the yearly adoption rate is .06 for mothers two standard deviations below the mean, .08 for mothers at the mean, and .12 for mothers two standard deviations above the mean.
For married mothers the yearly adoption rate is .10 for mothers two standard deviations below the mean, .15 for mothers at the mean, and .23 for mothers two standard deviations above the mean.
Assumptions Necessary for Testing Hypotheses
Equidispersion: variance equals the mean
  Underdispersion: variance less than the mean
  Overdispersion: variance larger than the mean
    • Typical
    • Overdispersion may lead us to believe that IVs are statistically significant when in fact they are not
    • Overdispersion can result from outliers and exclusion of relevant IVs, interaction terms, curvilinear terms and numerous other factors
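A rough first look at dispersion is to compare the sample variance of the counts with their mean. The sketch below uses hypothetical counts (not the foster-mother data), and the 5% band around 1 is an arbitrary illustrative choice. The Poisson assumption concerns the variance conditional on the IVs, so this marginal comparison is only a screening step, not a formal test.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean (1.0 under equidispersion)."""
    return variance(counts) / mean(counts)

def describe(ratio, band=0.05):
    """Label the ratio; the band around 1 is an arbitrary illustrative choice."""
    if ratio > 1 + band:
        return "overdispersed"
    if ratio < 1 - band:
        return "underdispersed"
    return "roughly equidispersed"

# Hypothetical counts, positively skewed with many zeros.
counts = [0, 0, 0, 0, 1, 0, 2, 0, 5, 1, 0, 3]
ratio = dispersion_ratio(counts)

print(f"variance / mean = {ratio:.2f} ({describe(ratio)})")
```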
Assumptions Necessary for Testing Hypotheses (cont’d)
Assumptions discussed in GZLM lecture
See below concerning underdispersion, zero-inflation, censoring, truncation
Negative Binomial Regression
Extension of Poisson regression
Allows overdispersion (but not underdispersion)
Standard method used to model overdispersed Poisson data
Given that overdispersion is the norm, the negative binomial model has more generality than the Poisson model
Multiple IV Example
DV = number of foster children adopted
IV = Perceived responsibility for parenting (scale scores transformed to z-scores)
IV = marital status, 0 = unmarried, 1 = married
Offset variable = log of years fostered
N = 285 foster mothers
Do foster mothers who take more responsibility for parenting adopt more foster children per year, controlling for marital status?
Test for Overdispersion
Estimate a negative binomial regression
  Negative binomial regression adds an ancillary parameter that allows overdispersion (but not underdispersion)
Test for Overdispersion (cont’d)
Ancillary parameter directly related to amount of overdispersion
  If data are not overdispersed ancillary parameter equals 0
  Poisson regression is a negative binomial regression with an ancillary parameter of 0
  Larger values indicate more overdispersion
    • Values typically range from 0 to about 4
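The role of the ancillary parameter is easiest to see in the variance function. The slides do not state which parameterization is used, so the sketch below assumes the common "NB2" form, in which the variance is the mean plus the ancillary parameter times the squared mean. Under that assumption an ancillary parameter of 0 gives the Poisson variance (equal to the mean), and larger values inflate it.

```python
def nb_variance(mu, ancillary):
    """Variance under the assumed NB2 form: mu + ancillary * mu**2."""
    if mu < 0 or ancillary < 0:
        raise ValueError("mean and ancillary parameter must be non-negative")
    return mu + ancillary * mu ** 2

mu = 2.0  # hypothetical mean count
for k in (0.0, 0.5, 1.0, 4.0):
    print(f"ancillary = {k}: variance = {nb_variance(mu, k)} (mean = {mu})")
```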
Test for Overdispersion (cont’d)
Table 5.14
  Test of the null hypothesis that ancillary parameter equals 0
  Rejection of this null hypothesis indicates overdispersion
    • p = .029 for alternative hypothesis that ancillary parameter > 0, so reject
Negative binomial regression used when data are overdispersed
Statistical Significance
Table 5.15
  Relationship between set of IVs and yearly adoption rate is statistically significant (χ² = 8.68, p = .013)
H0: β1 = β2 = βk = 0, same as
H0: IRR1 = IRR2 = IRRk = 1
Statistical Significance
Table 5.16
  Relationship between parenting responsibility and yearly adoption rate is statistically significant, controlling for marital status (χ² = 4.854, p = .028)
  Relationship between marital status and yearly adoption rate is statistically significant, controlling for parenting responsibility (χ² = 4.710, p = .030)
Statistical Significance
Table 5.17
  Relationship between parenting responsibility and yearly adoption rate is statistically significant, controlling for marital status (Wald χ² = 4.917, p = .027)
  Relationship between marital status and yearly adoption rate is statistically significant, controlling for parenting responsibility (Wald χ² = 4.845, p = .028)
IRR & Percentage Change: Parenting Responsibility
Exp(B) = IRR = 1.254
% change = 100(1.254 - 1) = 25%
Mothers with greater parenting responsibility adopt more children per year, controlling for marital status
For every one-standard deviation increase in parenting responsibility the yearly adoption rate increases by a factor of 1.25 (25%), controlling for marital status
IRR & Percentage Change: Marital Status
Exp(B) = IRR = 1.760
% change = 100(1.760 - 1) = 76%
Married mothers adopt more children per year, controlling for parenting responsibility
Being married increased the yearly adoption rate by a factor of 1.76 (76%), controlling for parenting responsibility
ln(λ) to λ
ln(rate) = -2.256 + (.227)(XzParentRole) + (.565)(XMarried)
e.g., mean parenting responsibility (z = 0) and unmarried mothers:
  ln(rate) = -2.256 + (.227)(0) + (.565)(0) = -2.256
  rate = e^(-2.256) = .10
  .10 children adopted yearly (rate)
Figure ((NB) Married & zParentRole.xls): Effect of Standardized Parenting Responsibility & Marital Status on Adoption Rate
x-axis: Standardized Parenting Responsibility; y-axis: Adoption Rate
Standardized Parenting Responsibility   -3    -2    -1    0     1     2     3
Unmarried                               0.05  0.07  0.08  0.10  0.13  0.16  0.21
Married                                 0.09  0.12  0.15  0.18  0.23  0.29  0.36
Question & Answer
Do foster mothers who take more responsibility for parenting adopt more foster children per year, controlling for marital status?
Yes. For every one-standard deviation increase in parenting responsibility the yearly adoption rate increases by a factor of 1.25 (25%), controlling for marital status.
Question & Answer (cont’d)
Do foster mothers who take more responsibility for parenting adopt more foster children per year, controlling for marital status?
For unmarried mothers the yearly adoption rate is .07 for mothers two standard deviations below the mean, .10 for mothers at the mean, and .16 for mothers two standard deviations above the mean.
For married mothers the yearly adoption rate is .12 for mothers two standard deviations below the mean, .18 for mothers at the mean, and .29 for mothers two standard deviations above the mean.
Assumptions Necessary for Testing Hypotheses
Assumptions discussed in GZLM lecture
See below concerning underdispersion, zero-inflation, censoring, truncation
Model Evaluation
Index plots
  Leverage values
  Standardized or unstandardized deviance residuals
  Cook’s D
Graph and compare observed and estimated counts
Analogs of R²
None in standard use and each may give different results
Typically much smaller than R² values in linear regression
Difficult to interpret
Multicollinearity
SPSS GZLM doesn’t compute multicollinearity statistics
Use SPSS linear regression
Problematic levels
  Tolerance < .10 or VIF > 10
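The two cutoffs are the same rule stated two ways, because VIF is the reciprocal of tolerance. As a sketch, assuming the usual definitions (tolerance = 1 - R², where R² comes from regressing one IV on all the other IVs), the check looks like this. The function names and the example R² values are hypothetical.

```python
def tolerance(r_squared):
    """Tolerance = 1 - R^2 from regressing one IV on the remaining IVs."""
    if not 0 <= r_squared < 1:
        raise ValueError("R^2 must be in [0, 1)")
    return 1 - r_squared

def vif(r_squared):
    """Variance inflation factor = 1 / tolerance."""
    return 1 / tolerance(r_squared)

def is_problematic(r_squared):
    """Apply the cutoffs from the slide: tolerance < .10 or VIF > 10."""
    return tolerance(r_squared) < 0.10 or vif(r_squared) > 10

for r2 in (0.50, 0.95):  # hypothetical R^2 values
    print(f"R^2 = {r2}: tolerance = {tolerance(r2):.2f}, "
          f"VIF = {vif(r2):.1f}, problematic = {is_problematic(r2)}")
```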
Additional Topics
Polytomous IVs
Curvilinear relationships
Interactions
Overview of the Process
Select IVs and decide whether to test curvilinear relationships or interactions
Carefully screen and clean data
Transform and code variables as needed
Estimate regression model
Examine assumptions necessary to estimate Poisson or negative binomial regression model, examine model fit, and revise model as needed
Overview of the Process (cont’d)
Test hypotheses about the overall model and specific model parameters, such as IRRs
Create tables and graphs to present results in the most meaningful and parsimonious way
Interpret results of the estimated model in terms of rates and IRRs, as appropriate
Additional Regression Models for Count DVs
Generalized Poisson model
  Data are under- or overdispersed
Poisson and negative binomial models for truncated samples
  Truncation occurs when cases from the population of interest are excluded based on characteristics of the DV
    • e.g., Cases with zero counts are excluded (e.g., only mothers who adopted one or more children are included in the sample)
Additional Regression Models for Count DVs (cont’d)
Zero-inflated Poisson and negative binomial models and Hurdle models
  Mix of two processes in the count variable, one that generates only zero counts, and another that generates both zero and positive counts
    • e.g., Some parents might not adopt because they are not interested in adopting (a process that generates only zero counts), and some parents might want to adopt but have not had the opportunity (a process that generates both zero and positive counts)
Additional Regression Models for Count DVs (cont’d)
Poisson and negative binomial models for censored DVs
  Censored variables are variables whose values are known over some range, but unknown beyond a certain value because they were recorded or collected only up (or down) to a certain value
    • e.g., Number of contacts between foster children and their biological parents measured as 0, 1, 2, or 3 or more per month are censored from above