Imputing dropouts for single-cell RNA sequencing based on multi-objective optimization

Ke Jin, Bo Li, Hong Yan, Xiao-Fei Zhang
DOI: https://doi.org/10.1093/bioinformatics/btac300
IF: 5.8
2022-04-29
Bioinformatics
Abstract: Motivation: Single-cell RNA sequencing (scRNA-seq) technologies have proved revolutionary for profiling transcriptomes at single-cell resolution. Excess zeros caused by technical noise, called dropouts, can mislead downstream analyses. Accurate imputation methods are therefore crucial for addressing the dropout problem. Results: We develop a new dropout imputation method for scRNA-seq data based on multi-objective optimization. Existing methods assume that the underlying data has a preconceived structure and impute the dropouts according to the information learned from that structure. In contrast, we assume that the data combines three types of latent structure: the horizontal structure (genes are similar to each other), the vertical structure (cells are similar to each other) and the low-rank structure. The combination weights and latent structures are learned using multi-objective optimization, and the weighted average of the observed data and the imputation results learned from the three types of structure is taken as the final result. Comprehensive downstream experiments show the superiority of our method in terms of recovery of true gene expression profiles, differential expression analysis, cell clustering and cell trajectory inference. Availability: The R package is available at https://github.com/Zhangxf-ccnu/scMOO and https://zenodo.org/record/5785195. The code to reproduce the downstream analyses in this paper can be found at https://github.com/Zhangxf-ccnu/scMOO_experiments_codes and https://zenodo.org/record/5786211. Supplementary information: Supplementary data are available at Bioinformatics online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
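
The idea in the abstract of combining the observed data with three structure-based estimates can be illustrated with a minimal Python sketch. This is not the scMOO algorithm and does not use the scMOO R package. It assumes a genes x cells matrix, uses correlation-based similarity weights and a truncated SVD as simple stand-ins for the three latent structures, and takes fixed, user-supplied weights instead of learning them by multi-objective optimization. The names `row_normalize` and `combine_structures`, and the parameters `rank` and `weights`, are hypothetical.

```python
import numpy as np


def row_normalize(S):
    """Turn a similarity matrix into non-negative averaging weights whose rows sum to 1."""
    S = np.clip(np.nan_to_num(S), 0.0, None)  # drop NaNs and negative similarities
    np.fill_diagonal(S, 0.0)                  # no gene/cell is used to predict itself
    sums = S.sum(axis=1, keepdims=True)
    sums[sums == 0] = 1.0                     # leave all-zero rows as zeros
    return S / sums


def combine_structures(X, rank=2, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted average of the observed matrix and three structure-based estimates.

    X       : genes x cells expression matrix (assumed already normalized).
    rank    : rank of the truncated-SVD estimate (illustrative choice).
    weights : (observed, gene-based, cell-based, low-rank); fixed here,
              whereas the paper learns the combination weights.
    """
    X = np.asarray(X, dtype=float)
    if not np.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")

    with np.errstate(invalid="ignore", divide="ignore"):
        gene_sim = np.corrcoef(X)      # genes x genes
        cell_sim = np.corrcoef(X.T)    # cells x cells

    # Horizontal structure: each gene estimated from similar genes.
    X_gene = row_normalize(gene_sim) @ X
    # Vertical structure: each cell estimated from similar cells.
    X_cell = X @ row_normalize(cell_sim).T
    # Low-rank structure: best rank-`rank` approximation via truncated SVD.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]

    w_obs, w_gene, w_cell, w_low = weights
    return w_obs * X + w_gene * X_gene + w_cell * X_cell + w_low * X_low


# Toy usage: a small count-like matrix with many zeros.
rng = np.random.default_rng(0)
X_toy = rng.poisson(1.0, size=(6, 5)).astype(float)
X_imputed = combine_structures(X_toy, rank=2)
```

Setting `weights=(1, 0, 0, 0)` returns the observed matrix unchanged, which makes the role of the combination weights easy to check on toy data.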
What problem does this paper attempt to address?