Critical assessment of missense variant effect predictors on disease-relevant variant data
Ruchir Rastogi, Ryan Chung, Sindy Li, Chang Li, Kyoungyeul Lee, Junwoo Woo, Dong-Wook Kim, Changwon Keum, Giulia Babbi, Pier Luigi Martelli, Castrense Savojardo, Rita Casadio, Kirsley Chennen, Thomas Weber, Olivier Poch, Francois Ancien, Gabriel Cia, Fabrizio Pucci, Daniele Raimondi, Wim Vranken, Marianne Rooman, Celine Marquet, Tobias Olenyi, Burkhard Rost, Gaia Andreoletti, Akash Kamandula, Yisu Peng, Constantina Bakolitsa, Matthew Mort, David N. Cooper, Timothy Bergquist, Vikas Pejaver, Xiaoming Liu, Predrag Radivojac, Steven E. Brenner, Nilah M. Ioannidis
DOI: https://doi.org/10.1101/2024.06.06.597828
2024-06-08
Abstract: Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction. To explore a variety of settings that are relevant for different clinical and research applications, we assess performance within different subsets of the evaluation data and within high-specificity and high-sensitivity regimes. We find strong performance of many predictors across multiple settings. Meta-predictors tend to outperform their constituent individual predictors; however, several individual predictors have performance similar to that of commonly used meta-predictors. The relative performance of predictors differs in high-specificity and high-sensitivity regimes, suggesting that different methods may be best suited to different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors supervised on pathogenicity labels from curated variant databases often learn label imbalances within genes. Overall, we find notable advances over the oldest and most cited missense variant effect predictors and continued improvements among the most recently developed tools. The CAGI Annotate-All-Missense challenge (also termed the Missense Marathon) will continue to assess state-of-the-art methods as the field progresses. Together, our results help illuminate the current clinical and research utility of missense variant effect predictors and identify potential areas for future development.
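The abstract does not say how the high-specificity and high-sensitivity regimes are quantified. As a rough illustration of why predictor rankings can differ between the two, the sketch below computes sensitivity at a fixed minimum specificity and specificity at a fixed minimum sensitivity. The function names, the convention that higher scores mean "more pathogenic", and the toy scores are all assumptions made for this example and are not taken from the assessment itself.

```python
def _rates(scores, labels, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called pathogenic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # pathogenic variants
    neg = [s for s, y in zip(scores, labels) if y == 0]  # benign variants
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    specificity = sum(s < threshold for s in neg) / len(neg)
    return sensitivity, specificity


def _candidate_thresholds(scores):
    # Every observed score, plus +inf so that "call nothing pathogenic" is allowed.
    return sorted(set(scores)) + [float("inf")]


def sensitivity_at_specificity(scores, labels, min_specificity):
    """Best sensitivity over thresholds whose specificity is >= min_specificity."""
    best = 0.0
    for t in _candidate_thresholds(scores):
        sens, spec = _rates(scores, labels, t)
        if spec >= min_specificity:
            best = max(best, sens)
    return best


def specificity_at_sensitivity(scores, labels, min_sensitivity):
    """Best specificity over thresholds whose sensitivity is >= min_sensitivity."""
    best = 0.0
    for t in _candidate_thresholds(scores):
        sens, spec = _rates(scores, labels, t)
        if sens >= min_sensitivity:
            best = max(best, spec)
    return best


# Hypothetical toy data: 1 = pathogenic, 0 = benign.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]

high_spec = sensitivity_at_specificity(toy_scores, toy_labels, 1.0)
high_sens = specificity_at_sensitivity(toy_scores, toy_labels, 1.0)
print(f"sensitivity at 100% specificity: {high_spec:.2f}")
print(f"specificity at 100% sensitivity: {high_sens:.2f}")
```

Two predictors with the same overall ranking quality can differ on these operating-point summaries, which is one way the relative ordering of methods could change between regimes.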
Genetics