Learning interpretable latent autoencoder representations with annotations of feature sets

Sergei Rybakov, Mohammad Lotfollahi, Fabian J. Theis, F. Alexander Wolf
DOI: https://doi.org/10.1101/2020.12.02.401182
2020-01-01
Abstract: Existing methods for learning latent representations for single-cell RNA-seq data are based on autoencoders and factor models. However, representations learned by autoencoders are hard to interpret and representations learned by factor models have limited flexibility. Here, we introduce a framework for learning interpretable autoencoders based on regularized linear decoders. It decomposes variation into interpretable components using prior knowledge in the form of annotated feature sets obtained from public databases. Through this, it provides an alternative to enrichment techniques and factor models for the task of explaining observed variation with biological knowledge. Benchmarking our model on two single-cell RNA-seq datasets, we demonstrate how our model outperforms an existing factor model regarding scalability while maintaining interpretability.

### Competing Interest Statement

The authors have declared no competing interest.
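
The abstract describes a linear decoder that is regularized with annotated feature sets. The sketch below is one possible reading of that idea, not the authors' implementation: each latent dimension is paired with one feature set, and an L1 penalty is applied only to decoder weights that connect a latent dimension to genes outside its set. The function names (`build_mask`, `decode`, `annotation_penalty`), the toy gene names, and the choice of an L1 penalty are all assumptions made for illustration.

```python
import numpy as np


def build_mask(gene_names, gene_sets):
    """Return a binary (n_sets, n_genes) matrix: 1 where a gene is annotated to a set."""
    index = {gene: j for j, gene in enumerate(gene_names)}
    mask = np.zeros((len(gene_sets), len(gene_names)))
    for i, members in enumerate(gene_sets.values()):
        for gene in members:
            if gene in index:  # genes absent from the data are ignored
                mask[i, index[gene]] = 1.0
    return mask


def decode(z, weights):
    """Linear decoder: each latent dimension contributes one row of gene weights."""
    return z @ weights


def annotation_penalty(weights, mask, lam=1.0):
    """L1 penalty on weights that fall outside the annotated feature sets (assumed form)."""
    return lam * np.abs(weights * (1.0 - mask)).sum()


# Hypothetical toy data: four genes and two annotated sets.
gene_names = ["g1", "g2", "g3", "g4"]
gene_sets = {"set_a": ["g1", "g2"], "set_b": ["g3", "g4", "g_missing"]}

mask = build_mask(gene_names, gene_sets)
rng = np.random.default_rng(0)
weights = rng.normal(size=mask.shape)      # decoder weights, one row per feature set
z = rng.normal(size=(5, len(gene_sets)))   # latent codes for five cells

x_hat = decode(z, weights)                                    # reconstruction, shape (5, 4)
penalty = annotation_penalty(weights, mask)                   # positive for dense weights
hard_penalty = annotation_penalty(weights * mask, mask)       # zero once weights respect the mask
```

Under this reading, a latent dimension stays interpretable because its decoder row is pushed towards the genes of a single annotated set, while the encoder can remain a flexible nonlinear network.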