Abstract: Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Although deep convolutional neural networks (CNNs) have achieved great success in predicting cis-regulatory elements, discovering motifs and their combinatorial patterns from these CNN models has remained difficult. We show that the main difficulty is the problem of multifaceted neurons, which respond to multiple types of sequence patterns. Because existing interpretation methods were mainly designed to visualize the class of sequences that can activate a neuron, the resulting visualization corresponds to a mixture of patterns, and such a mixture is usually difficult to interpret unless the mixed patterns are resolved. We propose the NeuronMotif algorithm to interpret such neurons. Given any convolutional neuron (CN) in the network, NeuronMotif first generates a large sample of sequences capable of activating the CN; this sample typically consists of a mixture of patterns. The sequences are then "demixed" in a layer-wise manner by backward clustering of the feature maps of the involved convolutional layers. NeuronMotif outputs the sequence motifs and depicts the syntax rules governing their combinations as position weight matrices organized in tree structures. Compared to existing methods, the motifs found by NeuronMotif have more matches to known motifs in the JASPAR database. The higher-order patterns uncovered for deep CNs are supported by the literature and ATAC-seq footprinting. Overall, NeuronMotif enables the deciphering of cis-regulatory codes from deep CNs and enhances the utility of CNNs in genome interpretation.
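The demixing idea can be illustrated with a toy sketch. This is not the NeuronMotif implementation: it assumes a small, hand-written set of activating sequences (hypothetical data), and it clusters the one-hot sequences directly with plain k-means rather than clustering feature maps backward through the convolutional layers. The helper names (one_hot, kmeans, pfm, consensus) are introduced here for illustration only. The point is simply that averaging a mixed sample blurs two patterns together, while averaging each cluster separately recovers one frequency matrix per pattern.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    return np.eye(4)[idx]

def kmeans(X, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        # squared distance from every point to its nearest chosen center
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def pfm(encoded):
    """Position frequency matrix: per-position base frequencies, shape (length, 4)."""
    return encoded.mean(axis=0)

def consensus(matrix):
    """Most frequent base at each position (ties go to the earlier letter in ALPHABET)."""
    return "".join(ALPHABET[i] for i in matrix.argmax(axis=1))

# Hypothetical sample of sequences that activate one neuron: a mixture of two patterns.
seqs = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTCA",
        "CACGTGC", "CACGTGC", "CACGTGT", "CACGTGC"]
X = np.stack([one_hot(s) for s in seqs])            # shape (n, length, 4)

# Without demixing: one matrix averaged over the whole mixture.
mixed = consensus(pfm(X))

# With demixing: cluster first, then build one matrix per cluster.
labels = kmeans(X.reshape(len(seqs), -1), k=2)
demixed = sorted(consensus(pfm(X[labels == j])) for j in range(2))

print("mixed consensus:  ", mixed)
print("demixed consensus:", demixed)
```

In this toy case the mixed matrix has no clear majority base at several positions, so its consensus matches neither planted pattern, whereas each per-cluster matrix recovers one of them. The number of clusters is fixed at two here by assumption; the abstract does not state how the number of patterns is chosen.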
