scFED: Clustering Identifying Cell Types of scRNA-Seq Data Based on Feature Engineering Denoising

Yang Liu, Feng Li, Junliang Shang, Jinxing Liu, Juan Wang, Daohui Ge
DOI: https://doi.org/10.1007/s12539-023-00574-y
2023-01-01
Interdisciplinary Sciences Computational Life Sciences
Abstract: Recently developed single-cell RNA-seq (scRNA-seq) technology lets researchers investigate disease development at the single-cell level. Clustering is one of the most essential strategies for analyzing scRNA-seq data, and choosing high-quality feature sets can significantly improve single-cell clustering and classification. For technical reasons, however, computationally burdensome and highly expressed genes cannot provide a stable and predictive feature set. In this study, we introduce scFED, a feature-engineered gene selection framework. scFED identifies prospective feature sets to eliminate noise fluctuation and fuses them with existing knowledge from the tissue-specific cellular taxonomy reference database (CellMatch) to avoid the influence of subjective factors. It then applies a reconstruction approach for noise reduction and amplification of crucial information. We apply scFED to four real single-cell datasets and compare it with other techniques. According to the results, scFED improves clustering, reduces the dimensionality of the scRNA-seq data, improves cell type identification when combined with clustering algorithms, and outperforms other methods. Therefore, scFED offers certain benefits for gene selection in scRNA-seq data.
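The workflow outlined in the abstract (select features, fuse them with known markers, reconstruct to reduce noise, then cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: variance ranking stands in for the feature selection, a truncated SVD stands in for the reconstruction step, a plain k-means stands in for the clustering algorithm, and the marker set is a hypothetical placeholder for genes that would come from CellMatch. The function names, parameters, and synthetic data are all assumptions made for illustration.

```python
import numpy as np

def select_features(counts, gene_names, marker_genes, n_top=50):
    """Return indices of the n_top most variable genes plus known markers.

    Assumption: variance of log1p counts is a stand-in for the paper's
    feature selection; marker_genes is a hypothetical marker list.
    """
    variances = np.log1p(counts).var(axis=0)
    top_idx = np.argsort(variances)[::-1][:n_top]
    marker_idx = [i for i, g in enumerate(gene_names) if g in marker_genes]
    return sorted(set(top_idx.tolist()) | set(marker_idx))

def reconstruct_low_rank(matrix, rank=5):
    """Denoise by keeping only the leading singular components.

    Assumption: truncated SVD is a stand-in for the paper's reconstruction.
    """
    mean = matrix.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(matrix - mean, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank] + mean

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels

# Synthetic counts: two groups of cells, each over-expressing 20 genes.
rng = np.random.default_rng(0)
n_cells, n_genes = 60, 200
rates = np.full((n_cells, n_genes), 1.0)
rates[:30, :20] = 8.0
rates[30:, 20:40] = 8.0
counts = rng.poisson(rates).astype(float)
gene_names = [f"g{i}" for i in range(n_genes)]
marker_genes = {"g3", "g150"}  # hypothetical markers, not taken from CellMatch

selected = select_features(counts, gene_names, marker_genes, n_top=30)
denoised = reconstruct_low_rank(np.log1p(counts[:, selected]), rank=2)
labels = kmeans(denoised, k=2)
```

Taking the union with the marker set keeps genes backed by prior knowledge even when their variance in a given dataset is low, which is the role the abstract assigns to the CellMatch fusion step.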