PrimateAI-3D outperforms AlphaMissense in real-world cohorts

David A. Parry, Tom Bosc, Tobias Hamp, Petko P. Fiziev, Abhishek Sharma, Irfahan Kassam, Jeremy McRae, Kyle Kai-How Farh
DOI: https://doi.org/10.1101/2024.01.12.24301193
2024-01-13
Abstract: Accurately predicting the impact of genetic variants is essential for interpreting genomic data, yet no consensus exists on how to measure classifier performance. We prepared the most comprehensive set of benchmarks to date and applied them to the recently published models PrimateAI-3D and AlphaMissense. PrimateAI-3D outperforms AlphaMissense on rare-disease cohort and biobank benchmarks, indicating that performance on clinical databases or assays does not reliably generalize to real-world cohorts.
Genetic and Genomic Medicine
What problem does this paper attempt to address?
The paper addresses the problem of accurately predicting the impact of genetic variants on human health. Specifically, the authors evaluate two state-of-the-art computational models, PrimateAI-3D and AlphaMissense, by comparing their performance across multiple real-world cohorts.

### Main Issues:

1. **Accurately predicting the impact of genetic variants**: Genetic variants are a major contributor to differences in disease susceptibility among individuals, but the clinical significance of most possible missense variants remains unclear. Computational algorithms that accurately predict variant effects are therefore crucial.
2. **Lack of a unified performance evaluation standard**: Although many computational algorithms have been developed, there is no agreed standard for assessing their accuracy, which hinders progress in the field.
3. **Limitations of existing benchmark datasets**: Existing benchmark datasets, such as ClinVar and HGMD, suffer from biases and circularity. Model performance on these datasets can therefore look overly optimistic and may not generalize to real-world data.

### Solutions:

- **Comprehensive benchmarking**: The authors prepared the most comprehensive set of benchmarks to date, including extensive data from rare disease cohorts, biobanks, and the UK Biobank Pharma Proteomics Project.
- **Comparison of PrimateAI-3D and AlphaMissense**: Using these benchmarks, the authors compared PrimateAI-3D and AlphaMissense in detail across different datasets, focusing on their performance in real-world cohorts.
- **New proteomics benchmark**: The authors introduced a new UK Biobank-based proteomics benchmark, containing 29.3 million unique protein measurements for 701 genes from 41,836 individuals, to evaluate model performance on measured data.
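This summary does not state which statistics the authors computed for each benchmark. As a rough illustration of the general idea, which is to score two predictors against a quantitative cohort measurement rather than against curated clinical labels, the sketch below ranks two sets of pathogenicity scores by their Spearman correlation with a measured trait. Spearman correlation is an assumed choice of metric here, and every name and number (`protein_level_shift`, `scores_model_a`, `scores_model_b`) is a hypothetical toy value, not data from the paper.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one entry per variant carrier.
# A more negative shift means a larger drop in the measured protein level.
protein_level_shift = [-0.1, -0.9, -0.4, -1.5, 0.0, -0.7]
scores_model_a = [0.2, 0.8, 0.5, 0.9, 0.1, 0.6]  # hypothetical predictor A
scores_model_b = [0.3, 0.4, 0.9, 0.7, 0.2, 0.1]  # hypothetical predictor B

rho_a = spearman(scores_model_a, protein_level_shift)
rho_b = spearman(scores_model_b, protein_level_shift)

# The predictor with the larger absolute correlation tracks the measured
# trait more closely on this toy benchmark.
better = "A" if abs(rho_a) > abs(rho_b) else "B"
print(f"rho_a={rho_a:.3f} rho_b={rho_b:.3f} better={better}")
```

Because the reference signal is a measurement taken in a cohort, neither predictor could have seen it as a training label, which is the property the benchmarks described above are meant to guarantee.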
### Conclusions:

- **Superior performance of PrimateAI-3D**: PrimateAI-3D outperformed AlphaMissense in all real-world cohorts, especially in rare disease cohorts and biobank benchmarks.
- **Importance of model training data**: Training on a combination of human and non-human primate variant data further improved PrimateAI-3D's performance, indicating that diverse training data can improve model generalizability.
- **Independence of benchmark testing**: The authors emphasize benchmarking on datasets that are independent of manually curated clinical labels, to avoid biases and circularity.

In summary, the paper uses comprehensive benchmarking to show that PrimateAI-3D performs better than AlphaMissense on real-world data, and it highlights practical considerations for developing and evaluating genetic variant prediction models.