DeepRegFinder: deep learning-based regulatory elements finder

Aarthi Ramakrishnan, George Wangensteen, Sarah Kim, Eric J Nestler, Li Shen
DOI: https://doi.org/10.1093/bioadv/vbae007
2024-01-01
Bioinformatics Advances
Abstract: Summary: Enhancers and promoters are important classes of DNA regulatory elements (DREs) that govern gene expression. Identifying them at a genomic scale is a critical task in bioinformatics. DREs often exhibit characteristic histone mark binding patterns, which can be captured by high-throughput ChIP-seq experiments. To account for variation and noise among binding sites, machine learning models are trained on known enhancer and promoter sites using histone mark ChIP-seq data and then used to predict enhancers and promoters in other genomic regions. To this end, we have developed a highly customizable program named DeepRegFinder, which automates the entire process of data processing, model training, and prediction. It uses convolutional and recurrent neural networks for model training and prediction. DeepRegFinder further categorizes enhancers and promoters into active and poised states, a distinctive feature that is valuable to researchers. Our method demonstrates improved precision and recall in comparison to existing algorithms for enhancer prediction across multiple cell types. Moreover, our pipeline is modular and eliminates the tedious steps involved in preprocessing, making it easier for users to apply it to their own data quickly. Availability and implementation: https://github.com/shenlab-sinai/DeepRegFinder