Gene selection of microarray data using heatmap analysis and Graph Neural Network

Soumen Kumar Pati, Ayan Banerjee, Sweta Manna
DOI: https://doi.org/10.1016/j.asoc.2023.110034
2023-01-20
Applied Soft Computing Journal
Abstract: In genomics, it is not feasible to investigate every gene at a microscopic level for disease classification. Any meaningful analysis would take substantial time, and computational resources would be wasted, because not all genes are responsible for the disease linked to a cell. Selecting the most significant genes from high-dimensional microarray data for disease classification remains challenging. In search of a better process, a novel gene subset selection technique has been developed based on Heatmap Analysis and Graph Neural Network (HAGNN). In the proposed method, heatmap analysis is performed on the different classes of microarray data to obtain Regions of Interest (ROIs). These ROIs are extracted from the original dataset and undergo a node reduction technique followed by an edge reduction technique in a Graph Neural Network (GNN). The paper concludes with an optimal subset of the most significant genes that cause cancer. Popular base classifiers are used to demonstrate the importance of the selected genes compared with the original data, using several metrics. The results show that the proposed methodology outperforms the other existing methods and advances GNN-based gene selection.