Improving Variant Prioritization in Exome Analysis by Entropy-Weighted Ensemble of Multiple Tools.

Yanjie Fan, Ying Zhou, Huili Liu, Xiaomei Luo, Ting Xu, Yu Sun, Tingting Yang, Linlin Chen, Xuefan Gu, Yongguo Yu
DOI: https://doi.org/10.1111/cge.14257
2022-01-01
Clinical Genetics
Abstract: Variant prioritization is a crucial step in the analysis of exome and genome sequencing. Multiple phenotype-driven tools have been developed to automate variant prioritization, but their efficacy in clinical settings with fuzzy phenotypic information, and whether an ensemble of these tools can outperform any single algorithm, remain to be assessed. A large rare disease cohort with heterogeneous phenotypic information, comprising a primary cohort of 1614 patients and a replication cohort of 1904 patients referred for exome sequencing, was recruited to assess the efficacy of variant prioritization tools and their ensemble. Three freely available tools (Exomiser, Xrare, and DeepPVP) and their ensemble were evaluated. The performance of all three tools was influenced by the attributes of the phenotypic input. When the three tools were combined by the weighted-sum entropy method (EWE3), the ensemble outperformed every single algorithm, ranking 78% of diagnostic variants in the top 3 (a 13% improvement over the current best performer; compare Exomiser: 63%, Xrare: 65%, and DeepPVP: 51%), 88% in the top 10, and 96% in the top 30. The results were replicated in an independent cohort. Our study supports using an entropy-weighted ensemble of multiple tools to improve variant prioritization and accelerate molecular diagnosis in exome/genome sequencing.
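The abstract does not spell out how EWE3 computes its weights, so the following is only a minimal sketch of the generic entropy weight method applied to per-variant scores from three tools. It assumes each tool's score has already been rescaled to a comparable non-negative range; the function names, variant labels, and toy scores are hypothetical and are not taken from the paper.

```python
import math


def entropy_weights(scores):
    """Generic entropy weight method (an assumption, not the paper's exact EWE3).

    scores: list of rows, one row per variant, one non-negative score per tool.
    Returns one weight per tool; the weights sum to 1.
    """
    n = len(scores)
    m = len(scores[0])
    diversities = []
    for j in range(m):
        column = [row[j] for row in scores]
        total = sum(column)
        if total == 0 or n < 2:
            # A tool with no signal (or a single variant) carries no information.
            diversities.append(0.0)
            continue
        entropy = 0.0
        for x in column:
            if x > 0:  # treat 0 * log(0) as 0
                p = x / total
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # normalise entropy to [0, 1]
        # Lower entropy means the tool discriminates more between variants.
        diversities.append(max(0.0, 1.0 - entropy))
    d_sum = sum(diversities)
    if d_sum == 0:
        return [1.0 / m] * m  # fall back to equal weights
    return [d / d_sum for d in diversities]


def ensemble_rank(variants, scores):
    """Rank variants by the entropy-weighted sum of their tool scores."""
    weights = entropy_weights(scores)
    combined = [sum(w * x for w, x in zip(weights, row)) for row in scores]
    order = sorted(range(len(variants)), key=lambda i: combined[i], reverse=True)
    return [(variants[i], combined[i]) for i in order], weights


# Hypothetical toy input: columns stand in for three tools' rescaled scores.
variants = ["variant_A", "variant_B", "variant_C"]
scores = [
    [0.9, 0.8, 0.1],
    [0.1, 0.2, 0.2],
    [0.1, 0.1, 0.9],
]
ranking, weights = ensemble_rank(variants, scores)
```

In this scheme a tool whose scores are nearly uniform across candidate variants receives a weight near zero, while a tool that separates candidates sharply contributes more to the combined rank.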