GEEES: inferring cell-specific gene-enhancer interactions from multi-modal single-cell data

Shuyang Chen, Sündüz Keleş
DOI: https://doi.org/10.1093/bioinformatics/btae638
IF: 5.8
2024-11-01
Bioinformatics
Abstract:
Motivation: Gene-enhancer interactions are central to transcriptional regulation. Current multi-modal single-cell datasets that profile transcriptome and chromatin accessibility simultaneously in a single cell are yielding opportunities to infer gene-enhancer associations in a cell type specific manner. Computational efforts for such multi-modal single-cell datasets thus far focused on methods for identification and refinement of cell types and trajectory construction. While initial attempts for inferring gene-enhancer interactions have emerged, these have not been evaluated against benchmark datasets that materialized from bulk genomic experiments. Furthermore, existing approaches are limited to inferring gene-enhancer associations at the level of grouped cells as opposed to individual cells, thereby ignoring regulatory heterogeneity among the cells.
Results: We present a new approach, GEEES for "Gene EnhancEr IntEractions from Multi-modal Single Cell Data," for inferring gene-enhancer associations at the single-cell level using multi-modal single-cell transcriptome and chromatin accessibility data. We evaluated GEEES alongside several multivariate regression-based alternatives we devised and state-of-the-art methods using a large number of benchmark datasets, providing a comprehensive assessment of current approaches. This analysis revealed significant discrepancies between gold-standard interactions and gene-enhancer associations derived from multi-modal single-cell data. Notably, incorporating gene-enhancer distance into the analysis markedly improved performance across all methods, positioning GEEES as a leading approach in this domain. While the overall improvement in performance metrics by GEEES is modest, it provides enhanced cell representation learning which can be leveraged for more effective downstream analysis. Furthermore, our review of existing experimentally driven benchmark datasets uncovers their limited concordance, underscoring the necessity for new high-throughput experiments to validate gene-enhancer interactions inferred from single-cell data.
Availability and implementation: https://github.com/keleslab/GEEES.