GMM-based Procedure for Multiple Hypotheses Testing

Jingyi Zhang, Zhijian He
DOI: https://doi.org/10.1080/03610918.2022.2082476
2024-01-01
Communications in Statistics - Simulation and Computation
Abstract: Multiple hypotheses testing has been widely studied in the literature due to its broad applicability, particularly in the fields of biogenetics and astrogeology. The false discovery rate (FDR), loosely defined as the expected proportion of false positives among all rejected hypotheses, is a useful error control criterion for large-scale multiple hypotheses testing. In this paper, we propose a Gaussian mixture model (GMM) to fit the distribution of the Z-value statistics, with the null distribution included as a fixed component. The fitted GMM estimates the null proportion and the actual null distribution simultaneously. A GMM-based procedure is then proposed to minimize the false nondiscovery rate (FNR) subject to a constraint on the FDR. Both simulations and real data analysis show that the GMM-based procedure performs well compared with several competing methods.
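The abstract does not spell out the estimation or rejection steps, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' procedure. It assumes the null component is pinned to N(0, 1), fits the remaining components by plain EM, computes a local false discovery rate from the fitted mixture, and rejects the hypotheses with the smallest local FDR values while their running average stays at or below the target level (a standard step-up rule from the local-FDR literature, swapped in here for illustration). All function names, the default number of alternative components, and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(z, mu, sigma):
    """Density of N(mu, sigma^2) at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def fit_gmm_fixed_null(z, n_alt=1, n_iter=100):
    """EM for a Gaussian mixture whose component 0 is pinned to N(0, 1).

    Assumption: a theoretical N(0, 1) null. Only the weight of component 0
    is updated; its mean and standard deviation are never touched.
    """
    k = n_alt + 1
    lo, hi = min(z), max(z)
    weights = [0.8] + [0.2 / n_alt] * n_alt
    means = [0.0] + [lo + (hi - lo) * (j + 1) / (n_alt + 1) for j in range(n_alt)]
    sigmas = [1.0] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each z-value.
        resp = []
        for x in z:
            dens = [weights[j] * normal_pdf(x, means[j], sigmas[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: update all weights, but parameters of alternatives only.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(z)
            if j == 0 or nj < 1e-12:
                continue
            means[j] = sum(r[j] * x for r, x in zip(resp, z)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, z)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)  # floor avoids collapse
    return weights, means, sigmas


def local_fdr(z, weights, means, sigmas):
    """Estimated posterior probability that each z-value comes from the null."""
    out = []
    for x in z:
        null = weights[0] * normal_pdf(x, means[0], sigmas[0])
        mix = sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas))
        out.append(null / mix if mix > 0 else 1.0)
    return out


def reject_by_lfdr(lfdr, alpha=0.1):
    """Reject the smallest local-FDR values whose running mean is <= alpha."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    running, n_reject = 0.0, 0
    for rank, i in enumerate(order, start=1):
        running += lfdr[i]
        if running / rank <= alpha:
            n_reject = rank
    return set(order[:n_reject])


def simulate_z(n_null=900, n_signal=100, shift=3.5, seed=0):
    """Hypothetical test data: N(0, 1) nulls plus N(shift, 1) signals."""
    rng = random.Random(seed)
    return ([rng.gauss(0.0, 1.0) for _ in range(n_null)]
            + [rng.gauss(shift, 1.0) for _ in range(n_signal)])
```

A typical call sequence would be `w, m, s = fit_gmm_fixed_null(z)`, then `reject_by_lfdr(local_fdr(z, w, m, s), alpha=0.1)`, with `w[0]` serving as the estimate of the null proportion. Pinning the null to N(0, 1) keeps the sketch short; an empirical null with estimated mean and variance would need a different update for component 0.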