Abstract

Motivation: Interactions among cis-regulatory elements such as enhancers and promoters are the main driving forces shaping context-specific chromatin structure and gene expression. Although computational methods exist for predicting gene expression from genomic and epigenomic information, most of them overlook long-range enhancer-promoter interactions because of the difficulty of precisely linking regulatory enhancers to their target genes. Recently, a novel high-throughput experimental approach named HiChIP has been developed and is generating comprehensive data on high-resolution interactions between promoters and distal enhancers. Meanwhile, many studies have suggested that deep learning achieves state-of-the-art performance in epigenomic signal prediction, thereby promoting the understanding of regulatory elements. In consideration of these two factors, we integrate proximal promoter sequences and HiChIP distal enhancer-promoter interactions to accurately model gene expression.

Results: We propose DeepExpression, a densely connected convolutional neural network that predicts gene expression using both promoter sequences and enhancer-promoter interactions. We demonstrate that our model consistently outperforms baseline methods, not only in the classification of binary gene expression status but also in the regression of continuous gene expression levels, in both cross-validation experiments and cross-cell-line predictions. We show that promoter sequence information is more informative than experimental enhancer information, and that the most beneficial enhancer-promoter interactions are those within ±100 kbp of a gene's transcription start site (TSS). Finally, we visualize motifs in both promoter and enhancer regions and show that the identified sequence signatures match known motifs. We expect to see a wide spectrum of applications using HiChIP data in deciphering the mechanism of gene regulation.

Availability: DeepExpression is freely available at https://github.com/wanwenzeng/DeepExpression.

Contact: ruijiang@tsinghua.edu.cn, ywang@amss.ac.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
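The two inputs named in the abstract, promoter sequence and enhancer-promoter interactions near the TSS, can be illustrated with a minimal sketch. It is not the DeepExpression implementation: the function names, the tuple layout for interactions, and the use of the enhancer midpoint to measure distance are all assumptions made for illustration. Only the ±100 kbp window comes from the text above.

```python
# Illustrative sketch only; helper names and data layout are hypothetical
# and are not taken from the DeepExpression repository.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats in A, C, G, T order.

    Bases outside ACGT (for example N) are encoded as all zeros.
    """
    rows = []
    for base in seq.upper():
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows


def interactions_near_tss(interactions, tss, window=100_000):
    """Keep interactions whose enhancer midpoint lies within +/- window bp of the TSS.

    interactions: iterable of (enhancer_start, enhancer_end, signal) tuples,
    assumed to be on the same chromosome as the gene.
    """
    kept = []
    for start, end, signal in interactions:
        midpoint = (start + end) // 2
        if abs(midpoint - tss) <= window:
            kept.append((start, end, signal))
    return kept


# Toy inputs: a 5-base "promoter" and two made-up interactions.
promoter = one_hot("ACGTN")
loops = [(950_000, 951_000, 12.0), (1_250_000, 1_251_000, 7.0)]
nearby = interactions_near_tss(loops, tss=1_000_000)
```

In this toy example only the first interaction is retained, since the midpoint of the second lies more than 100 kbp from the assumed TSS at position 1,000,000. A real pipeline would feed the encoded promoter and the retained interaction signals into the network; how the authors do so is not specified in the abstract.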

Integrative prediction of gene expression with chromatin accessibility and conformation data

Integrating distal and proximal information to predict gene expression via a densely connected convolutional neural network

Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction

Quantitative prediction of enhancer–promoter interactions

EPInformer: A Scalable Deep Learning Framework for Gene Expression Prediction by Integrating Promoter-enhancer Sequences with Multimodal Epigenomic Data

Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model

Exploiting sequence-based features for predicting enhancer–promoter interactions

Enhanced Performance of Gene Expression Predictive Models with Protein-Mediated Spatial Chromatin Interactions

Supervised enhancer prediction with epigenetic pattern recognition and targeted validation

Effective gene expression prediction from sequence by integrating long-range interactions

Supervised learning of enhancer-promoter specificity based on genome-wide perturbation studies highlights areas for improvement in learning

Predicting Transcription Factor Binding Sites with Deep Learning

Genome-wide in silico prediction of gene expression

Predicting enhancer-promoter interaction based on epigenomic signals

DeepEnhancer: Predicting Enhancers by Convolutional Neural Networks

A New Method for Enhancer Prediction Based on Deep Belief Network

Predicting Enhancers with Deep Convolutional Neural Networks

Computational methods for the prediction of chromatin interaction and organization using sequence and epigenomic profiles

Prediction of enhancer-promoter interactions via natural language processing

Transfer learning and DNA language models enhance transcription factor binding predictions

Predicting enhancer-promoter interactions by deep learning and matching heuristic