Hierarchical System of Gene Selection Based on Deep Learning and Ensemble Approach

S. Osowski, Dominik Seweryn
DOI: https://doi.org/10.1109/IJCNN52387.2021.9533956
2021-07-18
Abstract: The paper presents a hierarchical approach to detecting the most relevant genes (treated as biomarkers) in a dataset of gene expression microarrays of cancer data. In the first stage, a convolutional neural network combined with gradient-weighted class activation mapping (Grad-CAM) is applied to the data covering a family of many cancer types. The results of this stage show the regions of the matrix containing the most strongly activated genes for the individual cancer families. In the next stage, many classical feature selection methods are applied separately to the data representing each cancer type. They are arranged in an ensemble and operate on the limited number of features selected in the first stage of the gene microarray analysis. In the final stage, the results of all selection methods are integrated, and this fusion defines the final set of the most relevant genes. The main contribution of the work is the application of deep learning in the first stage of selection, followed by multiple analyses of its results and concluded by choosing a very small set of the most significant genes for each family of cancer. Such genes can be treated as biomarkers of the considered cancer type.
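The second and third stages described in the abstract (an ensemble of classical selectors working on the genes kept by the first stage, followed by fusion of their results) can be illustrated with a minimal sketch. The abstract does not name the individual selection methods or the fusion rule, so the three scorers below (Fisher score, absolute class-mean gap, absolute Pearson correlation with the label) and the vote-counting fusion are assumptions chosen for illustration. All function names, the synthetic data, and the `candidates` list standing in for the first-stage output are hypothetical.

```python
import math
import random
import statistics


def split_by_class(x, y):
    """Split one gene's expression values into the two class groups."""
    a = [v for v, c in zip(x, y) if c == 0]
    b = [v for v, c in zip(x, y) if c == 1]
    return a, b


def fisher_score(x, y):
    """Squared class-mean difference over the summed class variances."""
    a, b = split_by_class(x, y)
    num = (statistics.mean(a) - statistics.mean(b)) ** 2
    den = statistics.pvariance(a) + statistics.pvariance(b)
    return num / (den + 1e-12)


def mean_gap(x, y):
    """Absolute difference between the class means."""
    a, b = split_by_class(x, y)
    return abs(statistics.mean(a) - statistics.mean(b))


def abs_correlation(x, y):
    """Absolute Pearson correlation between expression and class label."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return abs(cov) / (sx * sy + 1e-12)


def ensemble_select(X, y, candidates, k, min_votes):
    """Each scorer votes for its top-k candidate genes; keep genes with enough votes.

    `candidates` plays the role of the gene subset kept by the first stage
    (assumed here; the CNN and Grad-CAM step is not reproduced).
    """
    votes = {j: 0 for j in candidates}
    for scorer in (fisher_score, mean_gap, abs_correlation):
        scores = {j: scorer([row[j] for row in X], y) for j in candidates}
        ranked = sorted(candidates, key=lambda j: scores[j], reverse=True)
        for j in ranked[:k]:
            votes[j] += 1
    return sorted(j for j, v in votes.items() if v >= min_votes)


# Synthetic two-class data: 40 samples, 20 genes, genes 3 and 7 are informative.
random.seed(0)
y = [0] * 20 + [1] * 20
X = []
for label in y:
    row = [random.gauss(0.0, 1.0) for _ in range(20)]
    if label == 1:
        row[3] += 4.0  # over-expressed in class 1
        row[7] -= 4.0  # under-expressed in class 1
    X.append(row)

candidates = list(range(10))  # hypothetical output of the first stage
selected = ensemble_select(X, y, candidates, k=2, min_votes=3)
print(selected)
```

Requiring agreement among all scorers (`min_votes=3`) keeps the final set very small, in the spirit of the abstract; a lower threshold would trade compactness for recall.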
Medicine, Computer Science