CADA: Phenotype-driven gene prioritization based on a case-enriched knowledge graph

Chengyao Peng, Simon Dieck, Alexander Schmid, Ashar Ahmad, Alexej Knaus, Maren Wenzel, Laura Mehnert, Birgit Zirn, Tobias Haack, Stephan Ossowski, Matias Wagner, Teresa Brunet, Nadja Ehmke, Magdalena Danyel, Stanislav Rosnev, Tom Kamphans, Guy Nadav, Nicole Fleischer, Holger Fröhlich, Peter Krawitz
DOI: https://doi.org/10.1101/2021.03.01.21251705
2021-03-02
Abstract: Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which is also increasingly used in electronic health records (EHRs). Many algorithms that perform HPO-based gene prioritization have been developed; however, the performance of many such tools suffers from an overrepresentation of atypical cases in the medical literature. This is certainly the case if the algorithm cannot handle features that occur with reduced frequency in a disorder. With CADA, we built a knowledge graph based on case annotations and disorder annotations, and we show that CADA exhibits superior performance, particularly for patients who present with the pathognomonic findings of a disease. Crucial to the design of our approach is the use of the growing amount of phenotypic information that diagnostic labs deposit in databases such as ClinVar. CADA is thus an ideal reference tool for differential diagnostics in rare disorders, and one that can be updated regularly.