Biologically Interpretable VAE with Supervision for Transcriptomics Data Under Ordinal Perturbations

Seyednami Niyakan, Byung-Jun Yoon, Xiaoning Qian, Xihaier Luo

DOI: https://doi.org/10.1101/2024.03.28.587231

2024-03-29

Abstract: Latent variable models such as the Variational Auto-Encoders (VAEs) have shown impressive performance for inferring expression patterns for cell subtyping and biomarker identification from transcriptomics data. However, the limited interpretability of their latent variables obscures deriving meaningful biological understanding of cellular responses to different external and internal perturbations. We here propose a novel deep learning framework, EXPORT (EXPlainable VAE for ORdinally perturbed Transcriptomics data), for analyzing ordinally perturbed transcriptomics data that can incorporate any biological pathway knowledge in the VAE latent space. With the corresponding pathway-informed decoder, the learned latent expression patterns can be explained as pathway-level responses to perturbations, offering direct interpretability with biological understanding. More importantly, we explicitly model the ordinal nature of many real-world perturbations into the EXPORT framework by training an auxiliary ordinal regressor neural network to capture corresponding expression changes in the VAE latent representations, for example under different dosage levels of radiation exposure. By incorporating ordinal constraints during the training of our proposed framework, we further enhance the model interpretability by guiding the VAE latent space to organize perturbation responses in a hierarchical manner. We demonstrate the utility of the inferred guided latent space for downstream tasks, such as identifying key regulatory pathways associated with specific perturbation changes by analyzing transcriptomics datasets on both bulk and single-cell data. Overall, we envision that our proposed approach can unravel unprecedented biological intricacies in cellular responses to various perturbations while bringing an additional layer of interpretability to biology-inspired deep learning models.
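To make the idea of a pathway-informed decoder concrete, here is a minimal sketch, assuming a single linear decoder layer whose weights are masked by pathway membership so that each latent dimension can only reconstruct the genes annotated to its pathway. The abstract does not specify the decoder architecture, so the masking scheme, the toy pathway names, the gene list, and the helper names `build_mask` and `masked_decode` are all illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical toy annotation (assumption): pathway name -> member genes.
PATHWAYS = {
    "dna_repair": ["ATM", "TP53"],
    "apoptosis": ["TP53", "BAX"],
}
GENES = ["ATM", "TP53", "BAX", "GAPDH"]


def build_mask(pathways, genes):
    """Return a pathway-by-gene 0/1 matrix: 1.0 where the gene is in the pathway."""
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def masked_decode(z, weights, mask):
    """Linear decoder whose weights are zeroed outside pathway membership."""
    n_genes = len(mask[0])
    return [sum(z[p] * weights[p][g] * mask[p][g] for p in range(len(z)))
            for g in range(n_genes)]


mask = build_mask(PATHWAYS, GENES)
weights = [[0.5] * len(GENES) for _ in PATHWAYS]  # dense weights before masking
z = [2.0, -1.0]  # one latent activity value per pathway
x_hat = masked_decode(z, weights, mask)  # reconstructed expression per gene
```

Because the mask removes every weight between a pathway and the genes outside it, a change in one latent value can only affect that pathway's member genes, which is what lets a latent dimension be read as a pathway-level activity. In this toy example GAPDH belongs to no pathway, so its reconstruction stays at zero regardless of `z`.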

Bioinformatics

What problem does this paper attempt to address?

The paper addresses the limited interpretability of Variational Autoencoders (VAEs) in transcriptomics analysis. Researchers often study how cells respond to ordinal perturbations, such as increasing drug doses in a screen or increasing levels of radiation exposure. Existing VAEs can reveal biological insights from large, heterogeneous perturbation-induced gene expression data, but their latent variables are hard to interpret, which makes these models "black boxes." To overcome this limitation, the paper proposes a new deep learning framework, EXPORT (EXPlainable VAE for ORdinally perturbed Transcriptomics data), for analyzing transcriptomics data under ordinal perturbations. EXPORT trains an auxiliary ordinal regression neural network on the latent representations and explicitly models the ordinal relationships in the training loss function, which guides the VAE latent space to organize perturbation responses in a hierarchical manner. According to the authors, this both improves the interpretability of the model and helps identify key regulatory pathways associated with specific perturbation changes, and the approach applies to both bulk and single-cell datasets. Overall, the authors expect the method to reveal new biological detail in cellular responses to perturbations while adding a layer of interpretability to biology-inspired deep learning models.
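The summary above does not state which ordinal loss EXPORT uses, so the following is only a sketch of one common formulation, assumed here for illustration: a perturbation level k out of K is encoded as K-1 binary answers to "is the level greater than j?", and the regressor is trained with a sum of binary cross-entropies over those thresholds. The function names and the example logits are hypothetical.

```python
import math


def ordinal_targets(level, n_levels):
    """Encode level k of K as K-1 binary answers to 'is the level > j?'."""
    return [1.0 if level > j else 0.0 for j in range(n_levels - 1)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordinal_loss(logits, level):
    """Sum of binary cross-entropies over the K-1 threshold questions."""
    targets = ordinal_targets(level, len(logits) + 1)
    loss = 0.0
    for logit, t in zip(logits, targets):
        p = sigmoid(logit)
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss


def predict_level(logits):
    """Predicted level = number of thresholds the sample is judged to exceed."""
    return sum(1 for logit in logits if sigmoid(logit) > 0.5)


# Hypothetical regressor output for one sample, with four dose levels (0..3).
logits = [3.0, 1.0, -2.0]
predicted = predict_level(logits)
losses = [ordinal_loss(logits, level) for level in range(4)]
```

Unlike plain multi-class cross-entropy, this encoding penalizes a prediction more the further it lands from the true level: for the logits above, the loss is lowest when the true level is 2, higher for the neighboring levels 1 and 3, and highest for level 0. In a setup like the one described, a term of this kind would be added to the usual VAE objective so that the latent space is pushed to reflect the ordering of the perturbation levels.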

VEGA is an interpretable generative model for inferring biological network activity in single-cell transcriptomics

A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data

Prediction of context-specific regulatory programs and pathways using interpretable deep learning

Out-of-distribution Prediction with Disentangled Representations for Single-Cell RNA Sequencing Data

Conditional Out-of-distribution Generation for Unpaired Data Using Transfer VAE

Variational autoencoders learn transferrable representations of metabolomics data

Learning interpretable latent autoencoder representations with annotations of feature sets

Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders

A Supervised Contrastive Framework for Learning Disentangled Representations of Cell Perturbation Data

Supervised Vector Quantized Variational Autoencoder for Learning Interpretable Global Representations

CoupleVAE: coupled variational autoencoders for predicting perturbational single-cell RNA sequencing data

Principled feature attribution for unsupervised gene expression analysis

Modelling Cellular Perturbations with the Sparse Additive Mechanism Shift Variational Autoencoder

Multi-ContrastiveVAE disentangles perturbation effects in single cell images from optical pooled screens

scVAE: variational auto-encoders for single-cell gene expression data

Interpretable Sentence Representation with Variational Autoencoders and Attention

Supervising the Decoder of Variational Autoencoders to Improve Scientific Utility

Learning identifiable and interpretable latent models of high-dimensional neural activity using pi-VAE

VAPOR: Variational autoencoder with transport operators decouples co-occurring biological processes in development

Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses