SympGAN: A systematic knowledge integration system for symptom-gene associations network
Kezhi Lu, Kuo Yang, Hailong Sun, Qian Zhang, Qiguang Zheng, Kuan Xu, Jianxin Chen, Xuezhong Zhou
DOI: https://doi.org/10.1016/j.knosys.2023.110752
IF: 8.139
2023-06-01
Knowledge-Based Systems
Abstract: Phenotypes (i.e., symptoms and clinical signs) are essential for clinical diagnosis and for research related to symptom science and precision health. As the clinically observable manifestations of a disease, symptoms are clinically significant because they are the direct reason patients seek medical care and the primary indicators clinicians use for diagnosis and treatment. However, a comprehensive phenotypic knowledge base and high-quality symptom–gene associations are lacking. Therefore, a thorough understanding of the relationships between symptoms and other entities is urgently needed to support scientific research and clinical health care. In this paper, we construct a systematic, large-scale, and high-quality symptom–gene association network system named SympGAN (accessible at http://www.sympgan.org/). We provide access to a database with millions of associations between symptoms, genes, diseases, and drugs, as well as a system that lets users search, analyze, and visualize the data and perform knowledge inference. We use state-of-the-art machine learning and deep learning algorithms as the backbone to form the final dataset. In addition, we use the RoBERTa-PubMed neural network for named entity recognition to assist in data screening. A knowledge graph is adopted to organize the relationships between different entities. We adopt the ConvE, TuckER, and HypER methods in knowledge completion experiments to validate the quality of the final knowledge graph triples. Based on the results, we provide online automatic knowledge inference interfaces. SympGAN has promising value for disease diagnosis, decision support in health care, precision health, and scientific research, as researchers and practitioners can easily access information about symptoms, diseases, targets, gene ontology, and drugs.
computer science, artificial intelligence
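Illustrative note: the abstract describes organizing associations as knowledge graph triples and checking their quality through knowledge completion experiments. The Python sketch below shows one way such triples could be stored and how ranking results from a link-prediction model are often summarized with mean reciprocal rank and Hits@k. It is only a minimal sketch under stated assumptions: the `TripleStore` class, the relation names, the placeholder entity identifiers, and the example ranks are all hypothetical, and the abstract does not say which metrics or data layout SympGAN actually uses.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical in-memory store for (head, relation, tail) triples."""

    def __init__(self):
        # Maps (head, relation) to the set of known tail entities.
        self._by_head = defaultdict(set)

    def add(self, head, relation, tail):
        self._by_head[(head, relation)].add(tail)

    def tails(self, head, relation):
        # Return known tails in sorted order; empty list if none are stored.
        return sorted(self._by_head.get((head, relation), set()))


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over the 1-based ranks of the correct entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Placeholder identifiers, not real SympGAN records.
kg = TripleStore()
kg.add("symptom:S1", "associated_with", "gene:G1")
kg.add("symptom:S1", "associated_with", "gene:G2")
kg.add("disease:D1", "has_symptom", "symptom:S1")

# Made-up ranks that a completion model might assign to three held-out triples.
example_ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(example_ranks)
hits3 = hits_at_k(example_ranks, 3)
```

In a real evaluation, the ranks would come from scoring every candidate entity for each held-out triple with a trained model such as ConvE, TuckER, or HypER, rather than being written by hand.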