Abstract: As the volume and complexity of data continue to expand across various scientific disciplines, the need for robust methods to account for the multiplicity of comparisons has become widespread. A popular measure of the type I error rate in the multiple testing literature is the false discovery rate (FDR). The FDR provides a powerful and practical approach to large-scale multiple testing and has been successfully used in a wide range of applications. The concept of FDR has gained wide acceptance in the statistical community, and various methods have been proposed to control the FDR. In this work, we review the latest developments in FDR control methodologies. We also develop a conceptual framework to better describe this vast literature; understand its intuition and key ideas; and provide guidance for the researcher interested in both the application and development of the methodology.

What problem does this paper attempt to address?

The paper addresses how to control the false discovery rate (FDR) effectively in large-scale multiple hypothesis testing. As data grow in volume and complexity, multiple comparisons become increasingly common; left uncorrected, they contribute to non-reproducible results, publication bias, and p-hacking in scientific research. Researchers therefore need powerful methods for handling multiplicity. The paper focuses on the following points:

1. **Definition and Importance of FDR**:
   - The FDR is the expected proportion of false discoveries among all discoveries:
     \[ \text{FDR}(R)=E\left[\frac{|R \cap H_0|}{|R| \vee 1}\right] \]
     where \(R\) is the set of rejected hypotheses, \(H_0\) is the set of true null hypotheses, and \(a \vee b\) denotes \(\max(a, b)\), so the ratio is zero when nothing is rejected.
2. **Methodological Development of FDR Control**:
   - The paper reviews the latest FDR control methods and proposes a general framework for describing them, helping researchers understand the intuition and key ideas behind them.
   - It introduces and discusses classic methods such as the Benjamini-Hochberg (BH) procedure and its variants, and the Sun-Cai (SC) procedure and its variants.
3. **FDR Control under Dependence Structures**:
   - In practice, hypothesis tests are often dependent, which poses a challenge to FDR control. The paper explores FDR control methods under different dependence structures, including positive regression dependence on a subset (PRDS), weak dependence, and factor models.
4. **Utilization of Auxiliary Information**:
   - Researchers can often use additional information (such as covariates) to improve FDR control methods. The paper describes how to integrate this auxiliary information into the FDR control process to increase detection power.
5. **Application of e-values**:
   - An e-value is a non-negative random variable \(E\) that satisfies \(E[E] \leq 1\) under the null hypothesis. The paper discusses the use of e-values in FDR control, especially their validity under arbitrary dependence.

In summary, the paper aims to give researchers a comprehensive review of FDR control methods and to offer new insights and frameworks for the increasingly complex multiple hypothesis testing problems of modern scientific research.
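To make the BH procedure and the e-value idea mentioned above more concrete, here is a minimal Python sketch. It is illustrative only and not taken from the paper: the function names are hypothetical, the BH function implements the standard step-up rule (reject the k hypotheses with the smallest p-values, where k is the largest rank with p_(k) <= alpha * k / m), and the e-value function assumes the e-BH rule of Wang and Ramdas, which may differ from the variants the paper emphasizes. The last helper computes the ratio inside the expectation in the FDR definition, assuming the set of true nulls is known, as it would be in a simulation.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices rejected by the standard BH step-up rule (hypothetical helper)."""
    m = len(p_values)
    # Indices ordered from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= alpha * k / m (0 if none qualifies).
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_star = rank
    # Step-up: reject every hypothesis ranked at or below k_star.
    return sorted(order[:k_star])


def e_bh(e_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices rejected by the e-BH rule (assumed variant, hypothetical helper)."""
    m = len(e_values)
    # Indices ordered from the largest to the smallest e-value.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    # Largest rank k such that k * e_[k] >= m / alpha (0 if none qualifies).
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        if rank * e_values[idx] >= m / alpha:
            k_star = rank
    return sorted(order[:k_star])


def false_discovery_proportion(rejected: Sequence[int], true_nulls: Sequence[int]) -> float:
    """|R intersect H_0| / max(|R|, 1), the quantity whose expectation is the FDR."""
    rejected_set = set(rejected)
    return len(rejected_set & set(true_nulls)) / max(len(rejected_set), 1)


if __name__ == "__main__":
    # Made-up inputs, chosen only to exercise the functions.
    p = [0.01, 0.039, 0.035, 0.005, 0.5]
    print(benjamini_hochberg(p, alpha=0.05))
    e = [150, 2, 60, 0.5, 40]
    print(e_bh(e, alpha=0.05))
    print(false_discovery_proportion([0, 1, 2, 3], true_nulls=[1, 4]))
```

In the made-up p-value example, the third-smallest p-value (0.035) exceeds its own threshold of 0.03 but is still rejected, because the fourth-smallest (0.039) falls below its threshold of 0.04; this is the step-up behaviour that distinguishes BH from a rank-by-rank cutoff. Note that a single run only yields a false discovery proportion; the FDR is its expectation over repeated experiments.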

False Discovery Control in Multiple Testing: A Brief Overview of Theories and Methodologies

A practical guide to methods controlling false discoveries in computational biology

Large-scale Multiple Testing: Fundamental Limits of False Discovery Rate Control and Compound Oracle

Online control of the false discovery rate in biomedical research

A New Procedure for Controlling False Discovery Rate in Large-Scale t-tests

Null-free False Discovery Rate Control Using Decoy Permutations

Sequential tests of multiple hypotheses controlling false discovery and nondiscovery rates

On stepdown control of the false discovery proportion

Controlling the Rate of GWAS False Discoveries

Large-scale adaptive multiple testing for sequential data controlling false discovery and nondiscovery rates

Directional false discovery rate control in large-scale multiple comparisons

Asymptotic uncertainty of false discovery proportion

Optimal False Discovery Rate Control for Large Scale Multiple Testing with Auxiliary Information

A step-down multiple hypotheses testing procedure that controls the false discovery rate under independence

On a generalized false discovery rate

Optimal control of false discovery criteria in the two-group model

Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings

Online multiple testing with e-values

A framework for Multi-A(rmed)/B(andit) testing with online FDR control

Power-enhanced multiple decision functions controlling family-wise error and false discovery rates

Controlling false discovery rate for mediator selection in high-dimensional data