Sample Size Calculation for Cluster Randomized Trials with Zero-inflated Count Outcomes

Zhengyang Zhou, Dateng Li, Song Zhang
DOI: https://doi.org/10.48550/arXiv.2009.10117
2020-09-22
Abstract: Cluster randomized trials (CRTs) have been widely employed in medical and public health research. Many clinical count outcomes, such as the number of falls in nursing homes, exhibit excessive zero values. In the presence of zero inflation, traditional power analysis methods for count data based on the Poisson or negative binomial distribution may be inadequate. In this study, we present a sample size method for CRTs with zero-inflated count outcomes. It is developed based on GEE regression directly modeling the marginal mean of a ZIP outcome, which avoids the challenge of testing two intervention effects under traditional modeling approaches. A closed-form sample size formula is derived which properly accounts for zero inflation, ICCs due to clustering, unbalanced randomization, and variability in cluster size. Robust approaches, including t-distribution-based approximation and Jackknife re-sampling variance estimator, are employed to enhance trial properties under small sample sizes. Extensive simulations are conducted to evaluate the performance of the proposed method. An application example is presented in a real clinical trial setting.
Applications, Methodology
What problem does this paper attempt to address?
This paper addresses sample size calculation for cluster-randomized trials (CRTs) whose outcome is a zero-inflated count. Traditional power analyses for count data assume a Poisson or negative binomial distribution and may be inadequate when the data contain many more zeros than those distributions predict. The paper proposes a sample size method built on generalized estimating equations (GEE) regression that directly models the marginal mean of a zero-inflated Poisson (ZIP) outcome. Modeling the marginal mean yields a single intervention effect to test, which avoids the difficulty in traditional ZIP modeling of testing two intervention effects, one on the zero-inflation component and one on the count component. The resulting closed-form sample size formula accounts for zero inflation, intraclass correlation coefficients (ICCs) due to clustering, unbalanced randomization, and variability in cluster size. To improve performance when the number of clusters is small, the method also uses robust approaches: a t-distribution-based approximation and a jackknife resampling variance estimator. The method is evaluated through extensive simulation studies and illustrated with an application to a real clinical trial.
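To make the zero-inflation point concrete, the sketch below uses the standard ZIP parameterization, in which an observation is a structural zero with probability pi and otherwise follows a Poisson distribution with rate lam. Under that assumption the marginal mean is (1 - pi) * lam and the marginal variance is (1 - pi) * lam * (1 + pi * lam), so the variance exceeds the mean whenever pi > 0. This is why a Poisson-based power analysis, which sets the variance equal to the mean, can understate the required sample size. The function names and parameter values are illustrative and do not come from the paper, and the code does not reproduce the paper's sample size formula, which additionally handles clustering, unbalanced randomization, and variable cluster sizes.

```python
import math
import random


def zip_marginal_mean(pi, lam):
    """Marginal mean of a ZIP outcome: E[Y] = (1 - pi) * lam."""
    return (1.0 - pi) * lam


def zip_marginal_variance(pi, lam):
    """Marginal variance of a ZIP outcome: Var(Y) = mu * (1 + pi * lam)."""
    mu = zip_marginal_mean(pi, lam)
    return mu * (1.0 + pi * lam)


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_zip(pi, lam, rng):
    """Draw one ZIP value: structural zero with probability pi, else Poisson."""
    if rng.random() < pi:
        return 0
    return sample_poisson(lam, rng)


def simulate_zip(n, pi, lam, seed=0):
    """Return the sample mean and sample variance of n independent ZIP draws."""
    rng = random.Random(seed)
    draws = [sample_zip(pi, lam, rng) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((y - mean) ** 2 for y in draws) / (n - 1)
    return mean, var


if __name__ == "__main__":
    pi, lam = 0.3, 2.0  # illustrative values, not taken from the paper
    mean, var = simulate_zip(100_000, pi, lam, seed=1)
    print("theoretical mean/variance:",
          zip_marginal_mean(pi, lam), zip_marginal_variance(pi, lam))
    print("simulated mean/variance:  ", mean, var)
```

With pi = 0.3 and lam = 2.0, the theoretical variance is 1.6 times the mean. The draws here are independent, so the simulation illustrates only the zero-inflation component of the variance, not the within-cluster correlation that a CRT design must also account for.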