Shrinkage estimation of gene interaction networks in single-cell RNA sequencing data

Duong H.T. Vo, Thomas Thorne
DOI: https://doi.org/10.1101/2024.03.20.585951
2024-03-25
Abstract: Gene interaction networks are graphs in which nodes represent genes and edges represent functional interactions between them. These interactions can occur at multiple levels, for instance gene regulation, protein-protein interaction, or metabolic pathways. To analyse gene interaction networks at a large scale, gene co-expression network analysis is often applied to high-throughput gene expression data such as RNA sequencing data. With advances in sequencing technology, gene expression can now be measured in individual cells. Single-cell RNA sequencing (scRNAseq) provides insights into cellular development, differentiation and characteristics at the transcriptomic level. The high sparsity and high dimensionality of the data pose challenges in scRNAseq data analysis. In this study, a sparse inverse covariance matrix estimation framework for scRNAseq data is developed to capture direct functional interactions between genes. Comparative analyses on simulated scRNAseq data highlight the high performance and fast computation of Stein-type shrinkage in high-dimensional data. Data transformation approaches also improve the performance of shrinkage methods on non-Gaussian distributed data. Zero-inflated modelling of scRNAseq data based on a negative binomial distribution enhances shrinkage performance on zero-inflated data without interfering with non-zero-inflated count data. The optimal zero-inflated Stein-type shrinkage framework is applied to experimental scRNAseq data, demonstrating its potential to construct sparser gene interaction networks with higher precision.
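As an illustration of the general idea behind the abstract, the following is a minimal sketch of linear shrinkage of a sample covariance matrix toward its diagonal, followed by inversion to obtain partial correlations. It is not the authors' implementation. The function names `shrink_covariance` and `partial_correlations`, the fixed intensity `lam=0.5`, the `log1p` transformation and the simulated counts are all assumptions made for illustration; Stein-type estimators typically choose the shrinkage intensity analytically, and the zero-inflated modelling described above is not reproduced here.

```python
import numpy as np


def shrink_covariance(X, lam):
    """Shrink the sample covariance of X (cells x genes) toward its diagonal.

    lam is a shrinkage intensity in [0, 1]; here it is a fixed,
    hypothetical value rather than an analytically estimated one.
    """
    S = np.cov(X, rowvar=False)   # sample covariance between genes
    T = np.diag(np.diag(S))       # diagonal target: variances only
    return (1.0 - lam) * S + lam * T


def partial_correlations(cov):
    """Convert a covariance matrix to partial correlations via its inverse."""
    P = np.linalg.inv(cov)        # precision (inverse covariance) matrix
    d = np.sqrt(np.diag(P))
    pcor = -P / np.outer(d, d)    # standardise and flip sign of off-diagonals
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated counts (assumption): 20 cells, 50 genes, so genes outnumber cells
# and the unshrunk sample covariance would be singular.
rng = np.random.default_rng(0)
counts = rng.negative_binomial(n=2, p=0.3, size=(20, 50))
X = np.log1p(counts)              # one possible variance-stabilising transform

sigma = shrink_covariance(X, lam=0.5)
pcor = partial_correlations(sigma)
```

Shrinking toward the diagonal makes the estimate invertible even when there are more genes than cells, which is why an inverse covariance matrix can be computed at all in this setting. Entries of `pcor` close to zero would correspond to absent edges in the network; how those entries are thresholded or tested is left out of this sketch.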
Systems Biology
What problem does this paper attempt to address?