Prediction of Transcription Factor Binding Sites with an Attention Augmented Convolutional Neural Network

Fang Jing, Shao-Wu Zhang, Shihua Zhang
DOI: https://doi.org/10.1109/tcbb.2021.3126623
2022-01-01
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract: Identification of transcription factor binding sites (TFBSs) is essential for revealing the rules of protein-DNA binding. Although several computational methods have been presented to predict TFBSs from epigenomic and sequence features, most of them ignore the features that are common across cell types, and it remains unclear to what extent these common features help with the task. To this end, we propose a new method, the Attention-augmented Convolutional Neural Network (ACNN), to predict TFBSs. ACNN uses attention-augmented convolutional layers to capture global and local contexts in DNA sequences and convolutional layers to capture features of histone modification markers. In addition, ACNN adopts private and shared convolutional neural network (CNN) modules to learn specific and common features, respectively. To encourage the shared CNN module to learn the common features, ACNN applies adversarial training. Results on 253 ChIP-seq datasets show that ACNN outperforms other existing methods, and that the attention-augmented convolutional layers and the adversarial training mechanism effectively improve prediction performance. Moreover, when labeled data are limited, ACNN also performs better than a baseline method. We further visualize the convolution kernels as motifs to demonstrate the interpretability of ACNN.