Detecting Boolean Asymmetric Relationships with a Loop Counting Technique and its Implications for Analyzing Heterogeneity within Gene Expression Datasets

Haosheng Zhou,Wei Lin,Sergio R. Labra,Stuart A. Lipton,Jeremy A. Elman,Nicholas J. Schork,Aaditya V. Rangan
DOI: https://doi.org/10.1101/2022.08.04.502792
2024-04-20
Abstract:Many traditional methods for analyzing gene-gene relationships focus on positive and negative correlations, both of which are a kind of ‘symmetric’ relationship. Biclustering is one such technique that typically searches for subsets of genes exhibiting correlated expression among a subset of samples. However, genes can also exhibit ‘asymmetric’ relationships, such as ‘if-then’ relationships used in boolean circuits. In this paper we develop a very general method that can be used to detect biclusters within gene-expression data that involve subsets of genes which are enriched for these ‘boolean-asymmetric’ relationships (BARs). These BAR-biclusters can correspond to heterogeneity that is driven by asymmetric gene-gene interactions, e.g., reflecting regulatory effects of one gene on another, rather than more standard symmetric interactions. Unlike typical approaches that search for BARs across the entire population, BAR-biclusters can detect asymmetric interactions that only occur among a subset of samples. We apply our method to a single-cell RNA-sequencing data-set, demonstrating that the statistically-significant BAR-biclusters indeed contain additional information not present within the more traditional ‘boolean-symmetric’-biclusters. For example, the BAR-biclusters involve different subsets of cells, and highlight different gene-pathways within the data-set. Moreover, by combining the boolean-asymmetric- and boolean-symmetric-signals, one can build linear classifiers which outperform those built using only traditional boolean-symmetric signals.
Bioinformatics
What problem does this paper attempt to address?