Fits a negative binomial GLM (via MASS::glm.nb()) and returns model
coefficients on the response scale (exponentiated), randomized quantile
residuals (RQR), a Pearson dispersion ratio, and a two-panel diagnostic
plot.
Arguments
- formula
A model formula (e.g.
y ~ x1 + x2). The response must be a non-negative integer count variable.- data
A data frame containing the variables in
formula.- ...
Additional arguments passed to
MASS::glm.nb().
Value
An object of class c("negbinGLM", "countGLMfit"), a list with:
callThe matched call.
modelThe underlying MASS::glm.nb fit object.
thetaThe estimated negative binomial dispersion parameter (smaller values indicate more overdispersion).
coefficientsA data frame with columns
term,exp.coef,lower.95,upper.95(all on the response/exponentiated scale).diagnosticsA list with:
rqrNumeric vector of randomized quantile residuals.
dispersion_ratioPearson chi-squared / df.residual. For a well-fitted negative binomial model this should be near 1.
plotA patchwork ggplot: fitted values vs RQR (left) and normal Q-Q of RQR (right).
aicAIC of the fitted model.
Details
Coefficient interpretation: Like Poisson regression, negative binomial regression models the log of the expected count. Exponentiating a coefficient gives the multiplicative change in the expected count for a one-unit increase in the predictor, adjusting for simultaneous linear changes in other predictors. For example, 1.5 means a 50% higher expected count.
When to use: Negative binomial is appropriate when count data show
overdispersion (variance > mean). A Pearson dispersion ratio from
poissonGLM() substantially above 1 (rule of thumb: > 1.5) is a common
signal. The negative binomial adds a free parameter theta to model this
extra variance.
Examples
df <- data.frame(
y = c(0L, 1L, 2L, 3L, 5L, 0L, 2L, 4L, 1L, 3L),
x1 = c(1.2, -0.4, 0.8, -1.1, 2.0, 0.3, -0.9, 1.5, -0.2, 0.7)
)
fit <- negbinGLM(y ~ x1, data = df)
#> Warning: iteration limit reached
#> Warning: iteration limit reached
print(fit)
#>
#> Call:
#> negbinGLM(formula = y ~ x1, data = df)
#>
#> Model family: negbinGLM
#>
#> Coefficients (on response scale):
#> term exp.coef lower.95 upper.95
#> (Intercept) 1.7989 1.0683 3.0291
#> x1 1.3396 0.8571 2.0936
#>
#> Dispersion ratio: 1.1657
#> AIC: 41.02
plot(fit)