Bioinformatics approach to identify the hub gene associated with COVID-19 and idiopathic pulmonary fibrosis

Wenchao Shi, Tinghui Li, Huiwen Li, Juan Ren, Meiyu Lv, Qi Wang, Yaowu He, Yao Yu, Lijie Liu, Shoude Jin, Hong Chen
DOI: https://doi.org/10.1049/syb2.12080
Abstract: The coronavirus disease 2019 (COVID-19) has developed into a global health crisis. Pulmonary fibrosis (PF), as one of the complications of SARS-CoV-2 infection, deserves attention. COVID-19 is a new and constantly evolving clinical entity, and many aspects of the disease remain unknown. The datasets of COVID-19 and idiopathic pulmonary fibrosis were obtained from the Gene Expression Omnibus. The hub genes were screened using the Random Forest (RF) algorithm according to disease severity in patients with COVID-19. A risk prediction model was developed to assess the prognosis of patients infected with SARS-CoV-2 and was evaluated on another dataset. Six genes (NELL2, GPR183, S100A8, ALPL, CD177, and IL1R2) may be associated with the development of PF in patients with severe SARS-CoV-2 infection. S100A8 is thought to be an important target gene that is closely associated with COVID-19 and pulmonary fibrosis. A neural network model was constructed and successfully predicted the prognosis of patients with COVID-19. With the increasing availability of COVID-19 datasets, bioinformatic methods can provide possible predictive targets for the diagnosis, treatment, and prognosis of the disease and indicate intervention directions for the development of clinical drugs and vaccines.