
Fits a zero-inflated negative binomial model (via pscl::zeroinfl()) with separate count and zero-inflation components. Returns exponentiated coefficients with 95% confidence intervals for both components, randomized quantile residuals, a dispersion ratio, the AIC, and a diagnostic plot.

Usage

zeroinflNegbinGLM(formula, data, ziformula = NULL, ...)

Arguments

formula

A model formula for the count component (e.g. y ~ x1 + x2). The response must be a non-negative integer count.

data

A data frame containing the variables in formula (and ziformula if provided).

ziformula

A one-sided formula for the zero-inflation component (e.g. ~ x1). When NULL (default), the same right-hand side as formula is used for both components. Use ~ 1 for an intercept-only zero-inflation model.

...

Additional arguments passed to pscl::zeroinfl().
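
The relationship between formula and ziformula can be sketched in base R. pscl::zeroinfl() accepts a two-part formula of the form y ~ x | z, where the part after | describes the zero-inflation component. The helper below, combine_formulas(), is a hypothetical name used only for illustration; the source does not show how zeroinflNegbinGLM() builds the formula internally.

```r
# Hypothetical helper (not part of the documented API): builds the
# two-part formula that pscl::zeroinfl() understands, `y ~ count | zero`.
combine_formulas <- function(formula, ziformula = NULL) {
  response  <- paste(deparse(formula[[2]]), collapse = " ")
  count_rhs <- paste(deparse(formula[[3]]), collapse = " ")
  # With ziformula = NULL, reuse the count right-hand side for the zero part.
  zero_rhs <- if (is.null(ziformula)) {
    count_rhs
  } else {
    paste(deparse(ziformula[[2]]), collapse = " ")
  }
  as.formula(paste(response, "~", count_rhs, "|", zero_rhs))
}

f_default   <- combine_formulas(y ~ x1 + x2)          # same terms in both parts
f_intercept <- combine_formulas(y ~ x1 + x2, ~ 1)     # intercept-only zero part
f_custom    <- combine_formulas(y ~ x1 + x2, ~ x1)    # zero part uses x1 only
print(f_default)
print(f_intercept)
print(f_custom)
```

Under this assumption, ziformula = NULL and ziformula equal to the right-hand side of formula would describe the same model.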

Value

An object of class c("zeroinflNegbinGLM", "zeroinflGLMfit", "countGLMfit"), a list with:

call

The matched call.

model

The underlying pscl::zeroinfl fit object.

theta

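The estimated negative binomial dispersion parameter (the size parameter of the count distribution). The count-process variance is mu + mu^2 / theta, so smaller values of theta indicate stronger overdispersion and large values approach the Poisson case.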

coefficients

A list with two data frames, each with columns term, exp.coef, lower.95, upper.95:

count

Exponentiated coefficients for the count component.

zero

Exponentiated coefficients for the zero-inflation component.

diagnostics

A list with rqr, dispersion_ratio, and plot (see poissonGLM() for details).

aic

AIC of the fitted model.
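
The rqr element is documented under poissonGLM(), so its exact computation is not shown here. As a rough illustration of the general technique, the sketch below computes randomized quantile residuals for zero-inflated negative binomial data in base R. It assumes the parameters are known rather than estimated, and the names pi0, mu, theta, and zinb_cdf() are illustrative, not part of the package.

```r
set.seed(1)

# Assumed (known) parameters for the illustration
n     <- 500
pi0   <- 0.3   # probability of a structural zero
mu    <- 2     # mean of the negative binomial count process
theta <- 1.5   # negative binomial size (dispersion) parameter

# Simulate zero-inflated negative binomial counts
structural_zero <- rbinom(n, size = 1, prob = pi0) == 1
y <- ifelse(structural_zero, 0L, rnbinom(n, size = theta, mu = mu))

# CDF of the zero-inflated negative binomial: the structural-zero mass
# is added at zero, and the count process is down-weighted by (1 - pi0).
zinb_cdf <- function(q, pi0, mu, theta) {
  ifelse(q < 0, 0, pi0 + (1 - pi0) * pnbinom(q, size = theta, mu = mu))
}

# For a discrete response, draw u uniformly between F(y - 1) and F(y),
# then map it to the standard normal scale.
lower <- zinb_cdf(y - 1, pi0, mu, theta)
upper <- zinb_cdf(y, pi0, mu, theta)
u     <- runif(n, min = lower, max = upper)
rqr   <- qnorm(u)

# Under a correctly specified model these should look standard normal.
print(c(mean = mean(rqr), sd = sd(rqr)))
```

With a fitted model, the known parameters would be replaced by the fitted values for each observation; a normal Q-Q plot of the residuals is then a common check of distributional fit.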

Details

Coefficient interpretation:

  • Count component: exponentiating a coefficient gives the multiplicative change in the expected count among non-structural-zero observations for a one-unit increase in the predictor, holding the other predictors fixed. For example, a value of 1.5 corresponds to a 50% higher expected count.

  • Zero component: exponentiating a coefficient gives the multiplicative change in the odds of being a structural zero (vs. entering the count process) for a one-unit increase in the predictor.
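
The two interpretations above can be combined into a small worked calculation. The sketch below copies the exponentiated estimates from the printed output in the Examples section and assumes a log link for the count component and a logit link for the zero component. The helper predict_zinb_mean() is hypothetical and exists only for this illustration.

```r
# Exponentiated estimates copied from the printed example output
count_exp <- c("(Intercept)" = 1.3408, x1 = 0.9433)
zero_exp  <- c("(Intercept)" = 0.6737, x1 = 2.1030)

# Hypothetical helper: turns the multiplicative effects into predictions.
predict_zinb_mean <- function(x1) {
  # Expected count among non-structural-zero observations
  mu <- count_exp[["(Intercept)"]] * count_exp[["x1"]]^x1
  # Odds, then probability, of being a structural zero
  odds   <- zero_exp[["(Intercept)"]] * zero_exp[["x1"]]^x1
  p_zero <- odds / (1 + odds)
  data.frame(
    x1                = x1,
    count_mean        = mu,
    p_structural_zero = p_zero,
    overall_mean      = (1 - p_zero) * mu  # mean of the mixture
  )
}

res <- predict_zinb_mean(c(0, 1))
print(res)

# Percent change in the expected count per one-unit increase in x1
pct_change <- 100 * (count_exp[["x1"]] - 1)
print(pct_change)
```

Note that the confidence intervals in the example output are wide and include 1 for every term, so these point predictions are illustrative arithmetic rather than evidence of an effect.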

When to use: The zero-inflated negative binomial is the most flexible of the four models because it handles both excess zeros and overdispersion in the non-zero counts. Prefer it over zeroinflPoissonGLM() when the non-zero counts remain overdispersed after zero-inflation is accounted for.
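
One way to check whether the negative binomial component is needed is to compare the two zero-inflated fits directly. The sketch below calls pscl::zeroinfl() itself, not the wrappers documented here, and uses the bioChemists data shipped with pscl; the choice of predictors is arbitrary and only for illustration. The block is skipped when pscl is not installed.

```r
if (requireNamespace("pscl", quietly = TRUE)) {
  data("bioChemists", package = "pscl")

  # Same predictors in both fits; only the count distribution differs.
  zip  <- pscl::zeroinfl(art ~ fem + ment | ment,
                         data = bioChemists, dist = "poisson")
  zinb <- pscl::zeroinfl(art ~ fem + ment | ment,
                         data = bioChemists, dist = "negbin")

  # Lower AIC favours the corresponding model.
  aic_table <- AIC(zip, zinb)
  print(aic_table)

  # Likelihood-ratio statistic for the extra dispersion parameter.
  # The Poisson model sits on the boundary of the parameter space, so the
  # usual chi-squared p-value is halved here (a common approximation).
  lr      <- as.numeric(2 * (logLik(zinb) - logLik(zip)))
  p_value <- 0.5 * pchisq(lr, df = 1, lower.tail = FALSE)
  print(c(theta = zinb$theta, lr = lr, p_value = p_value))
}
```

A clearly lower AIC for the negative binomial fit, together with a small estimated theta, suggests that overdispersion remains after zero-inflation is modelled.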

Examples

df <- data.frame(
  y  = c(0L, 0L, 0L, 1L, 2L, 0L, 3L, 0L, 1L, 0L),
  x1 = c(1.2, -0.4, 0.8, -1.1, 2.0, 0.3, -0.9, 1.5, -0.2, 0.7)
)
fit <- zeroinflNegbinGLM(y ~ x1, data = df)
print(fit)
#> 
#> Call:
#> zeroinflNegbinGLM(formula = y ~ x1, data = df)
#> 
#> Model family: zeroinflNegbinGLM 
#> 
#> Count component (exponentiated coefficients):
#>         term exp.coef lower.95 upper.95
#>  (Intercept)   1.3408   0.4922   3.6523
#>           x1   0.9433   0.4383   2.0304
#> 
#> Zero-inflation component (exponentiated coefficients):
#>         term exp.coef lower.95 upper.95
#>  (Intercept)   0.6737   0.0820   5.5325
#>           x1   2.1030   0.4077  10.8486
#> 
#> Dispersion ratio: 2.0053
#> AIC: 31.60
plot(fit)


# Intercept-only zero component:
fit2 <- zeroinflNegbinGLM(y ~ x1, data = df, ziformula = ~ 1)