Missing Value Imputation With Low-Rank Matrix Completion in Single-Cell RNA-Seq Data by Considering Cell Heterogeneity

Meng Huang, Xiucai Ye, Hongmin Li, Tetsuya Sakurai
DOI: https://doi.org/10.3389/fgene.2022.952649
IF: 3.7
2022-07-14
Frontiers in Genetics
Abstract: Single-cell RNA-sequencing (scRNA-seq) technologies enable the measurement of gene expression in individual cells, which is helpful for exploring cancer heterogeneity and precision medicine. However, various sources of technical noise lead to false zero values (missing gene expression values) in scRNA-seq data, termed dropout events. These zero values complicate the analysis of cell patterns, which hinders the high-precision analysis of intra-tumor heterogeneity. Recovering missing gene expression values remains a major obstacle in scRNA-seq data analysis. In this study, taking cell heterogeneity into consideration, we develop a novel method, called single cell Gauss–Newton Gene expression Imputation (scGNGI), to impute scRNA-seq expression matrices using low-rank matrix completion. Experimental results on simulated and real scRNA-seq datasets show that scGNGI imputes missing values in scRNA-seq gene expression data more effectively and improves downstream analysis compared to other state-of-the-art methods. Moreover, we show that the proposed method better preserves gene expression variability among cells. Overall, this study helps explore complex biological systems and precision medicine with scRNA-seq data.
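To illustrate the general idea of low-rank matrix completion mentioned in the abstract, the following is a minimal sketch using a simple iterative truncated-SVD ("hard-impute") scheme. It is not the scGNGI algorithm: it does not use Gauss–Newton updates and does not account for cell heterogeneity. The function name `impute_low_rank`, its parameters, and the choice to treat every zero as missing are assumptions made purely for illustration.

```python
import numpy as np

def impute_low_rank(X, rank=2, n_iter=100):
    """Fill zero entries of X using a rank-`rank` approximation (hard-impute sketch)."""
    X = np.asarray(X, dtype=float)
    observed = X != 0                        # assumption: every zero is treated as missing
    filled = X.copy()
    filled[~observed] = X[observed].mean()   # initialise missing entries with the observed mean
    for _ in range(n_iter):
        # Truncated SVD gives the best rank-`rank` approximation of the current estimate.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed values fixed; update only the missing entries.
        filled = np.where(observed, X, low_rank)
    return filled

# Toy example: a rank-1 "expression" matrix (genes x cells) with one simulated dropout.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
X = truth.copy()
X[1, 1] = 0.0                                # simulated dropout event
completed = impute_low_rank(X, rank=1)
```

In real scRNA-seq data many zeros are true biological zeros, so treating all zeros as missing, as this sketch does, is a simplification.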
genetics & heredity