Exploiting sequence-based features for predicting enhancer–promoter interactions

Yang Yang, Ruochi Zhang, Shashank Singh, Jian Ma
DOI: https://doi.org/10.1093/bioinformatics/btx257
IF: 5.8
2017-07-12
Bioinformatics
Abstract:
MOTIVATION: A large number of distal enhancers and proximal promoters form enhancer-promoter interactions to regulate target genes in the human genome. Although recent high-throughput genome-wide mapping approaches have allowed us to more comprehensively recognize potential enhancer-promoter interactions, it is still largely unknown whether sequence-based features alone are sufficient to predict such interactions.
RESULTS: Here, we develop a new computational method (named PEP) to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given. The two modules in PEP (PEP-Motif and PEP-Word) use different but complementary feature extraction strategies to exploit sequence-based information. The results across six different cell types demonstrate that our method is effective in predicting enhancer-promoter interactions as compared to the state-of-the-art methods that use functional genomic signals. Our work demonstrates that sequence-based features alone can reliably predict enhancer-promoter interactions genome-wide, which could potentially facilitate the discovery of important sequence determinants for long-range gene regulation.
AVAILABILITY AND IMPLEMENTATION: The source code of PEP is available at: https://github.com/ma-compbio/PEP
CONTACT: jianma@cs.cmu.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology