Statistical model for overdispersed count outcome with many zeros: an approach for direct marginal inference

Samuel Iddi, Kwabena Doku-Amponsah
DOI: https://doi.org/10.48550/arXiv.1504.00071
2015-04-01
Abstract: Marginalized models are in great demand among researchers in the life sciences, particularly in clinical trials, epidemiology, health economics, surveys and many other fields, since they allow inference to be generalized to the entire population under study. For count data, standard procedures such as Poisson regression and the negative binomial model provide population-average inference for model parameters. However, the occurrence of excess zero counts and lack of independence in empirical data have necessitated extensions of these models to accommodate such phenomena. These extensions, though useful, complicate the interpretation of effects. For example, the zero-inflated Poisson model accounts for the presence of excess zeros, but its parameter estimates do not support direct marginal inference as those of its base model, the Poisson model, do. Marginalized models that account for excess zeros are underdeveloped, even though demand for them is high. The aim of this paper is to develop a marginalized model for a zero-inflated univariate count outcome in the presence of overdispersion. Emphasis is placed on methodological development, efficient estimation of model parameters, implementation, and application to two empirical studies. A simulation study is performed to assess the performance of the model. Results from the analysis of two case studies indicate that, in terms of likelihood comparisons and AIC values, the refined procedure performs significantly better than models which do not simultaneously correct for overdispersion and the presence of excess zero counts. The simulation study also supports these findings. In addition, the proposed technique yields small biases and mean square errors for model parameters. To ensure that the proposed method enjoys widespread use, it is implemented using the SAS NLMIXED procedure with minimal coding effort.
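The interpretability problem described in the abstract can be illustrated with a small numerical sketch. It assumes the textbook zero-inflated Poisson parameterization (a log link for the Poisson mean and a logit link for the excess-zero probability), under which the marginal mean is (1 - pi) * lambda. The function name and all coefficient values below are hypothetical, chosen only for illustration; this is not the marginalized model proposed in the paper.

```python
import math

def zip_marginal_mean(x, beta0, beta1, gamma0, gamma1):
    """Population-average mean of a zero-inflated Poisson outcome.

    Assumes the standard ZIP parameterization (not the paper's model):
      lambda(x) = exp(beta0 + beta1 * x)       Poisson mean of the at-risk class
      pi(x)     = expit(gamma0 + gamma1 * x)   probability of an excess zero
    """
    lam = math.exp(beta0 + beta1 * x)
    pi = 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * x)))
    return (1.0 - pi) * lam

# Hypothetical coefficients, for illustration only.
beta0, beta1, gamma0, gamma1 = 0.5, 0.3, -1.0, 0.8

# Rate ratio implied by the count-part coefficient alone.
latent_ratio = math.exp(beta1)

# Rate ratio in the whole population when x moves from 0 to 1.
marginal_ratio = (zip_marginal_mean(1, beta0, beta1, gamma0, gamma1)
                  / zip_marginal_mean(0, beta0, beta1, gamma0, gamma1))

print(f"exp(beta1)          = {latent_ratio:.3f}")
print(f"marginal mean ratio = {marginal_ratio:.3f}")
```

Because the covariate also shifts the excess-zero probability, exp(beta1) describes only the at-risk class, and the two ratios differ. They coincide in this sketch only when gamma1 is zero, which is why a model parameterized directly in terms of the marginal mean is easier to interpret at the population level.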
Methodology