Gene selection and clustering of single-cell data based on Fisher score and genetic algorithm

Junhong Feng, Jie Zhang, Xiaoshu Zhu, Jian-Hong Wang
DOI: https://doi.org/10.1007/s11227-022-04920-7
2022-11-25
Abstract: The very large number of genes in single-cell RNA sequencing (scRNA-seq) data may degrade clustering performance. To obtain high-quality genes for clustering, this study proposes a novel gene selection algorithm based on the Fisher score and a genetic algorithm with dynamic crossover (abbreviated as FDCGA). To reduce time and space complexity, FDCGA first uses the Fisher score to obtain preliminary candidate genes and then applies the genetic algorithm with dynamic crossover to select the genes that benefit data clustering and analysis. Experimental results on several publicly available real-world scRNA-seq datasets show that FDCGA outperforms several competing methods on both the normalized mutual information (NMI) and adjusted Rand index (ARI) metrics and achieves significant optimization performance. The convergence experiments show that the fitness of FDCGA increases with the number of iterations and converges to a stable value. Statistical analysis shows that FDCGA outperforms the competing methods with statistical significance.
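The first stage described in the abstract, ranking genes by Fisher score to obtain a preliminary candidate set, can be illustrated with a minimal sketch. The abstract does not state the exact formula or how group labels are obtained, so this assumes the standard Fisher score (between-group scatter divided by within-group scatter, per gene) and assumes a label per cell is already available, for example from a preliminary clustering. The function names `fisher_scores` and `top_k_genes`, the `eps` guard, and the toy matrix are hypothetical and not taken from the paper. The genetic algorithm stage is not sketched because the abstract gives no details of the dynamic crossover.

```python
from collections import defaultdict


def fisher_scores(X, labels, eps=1e-12):
    """Return one Fisher score per gene (column) of X.

    X      : list of cells, each a list of expression values (cells x genes).
    labels : one group label per cell. Assumed to be available, e.g. from a
             preliminary clustering; the abstract does not specify this.
    eps    : small constant (an assumption of this sketch) that avoids
             division by zero when a gene has no within-group variance.
    """
    n_cells, n_genes = len(X), len(X[0])

    # Group the cells (rows) by their label.
    groups = defaultdict(list)
    for row, lab in zip(X, labels):
        groups[lab].append(row)

    scores = []
    for j in range(n_genes):
        overall_mean = sum(row[j] for row in X) / n_cells
        between = 0.0  # size-weighted spread of group means around the overall mean
        within = 0.0   # size-weighted variance inside each group
        for rows in groups.values():
            n_k = len(rows)
            mean_k = sum(r[j] for r in rows) / n_k
            var_k = sum((r[j] - mean_k) ** 2 for r in rows) / n_k
            between += n_k * (mean_k - overall_mean) ** 2
            within += n_k * var_k
        scores.append(between / (within + eps))
    return scores


def top_k_genes(X, labels, k):
    """Indices of the k genes with the highest Fisher score."""
    scores = fisher_scores(X, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data: 4 cells x 3 genes. Only gene 0 separates the two groups.
X = [
    [1.0, 5.0, 0.0],
    [1.2, 4.0, 1.0],
    [9.0, 5.0, 1.0],
    [9.2, 4.0, 0.0],
]
labels = [0, 0, 1, 1]
print(top_k_genes(X, labels, 1))
```

In this toy matrix, genes 1 and 2 have identical group means, so their scores are zero and gene 0 is the only candidate kept. In a two-stage pipeline of the kind the abstract describes, the retained indices would then define the reduced search space for the second stage.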
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture
What problem does this paper attempt to address?