DeepSS: Exploring Splice Site Motif Through Convolutional Neural Network Directly from DNA Sequence

Xiuquan Du, Yu Yao, Yanyu Diao, Huaixu Zhu, Yanping Zhang, Shuo Li
DOI: https://doi.org/10.1109/access.2018.2848847
IF: 3.9
2018-01-01
IEEE Access
Abstract: Splice site prediction and interpretation are crucial to understanding the complicated mechanisms underlying gene transcriptional regulation. Although existing computational approaches can classify true and false splice sites, their performance relies mostly on a set of sequence- or structure-based features, and model interpretability is relatively weak. In view of these challenges, we report a deep learning-based framework (DeepSS), which consists of a DeepSS-C module to classify splice sites and a DeepSS-M module to detect splice site sequence patterns. Unlike previous pipelines of feature construction followed by model training, the DeepSS-C module performs feature learning throughout model training. Compared with state-of-the-art algorithms, experimental results show that the DeepSS-C module yields more accurate predictions on six publicly available donor/acceptor splice site data sets. In addition, the parameters of the trained DeepSS-M module are used for model interpretation and downstream analysis, including: 1) detection of genome factors (the truly relevant motifs that induce the related biological process) via filters from a deep learning perspective; 2) analysis of the ability of CNN filters to detect motifs; 3) co-analysis of filters and motifs on DNA sequence patterns. DeepSS is freely available at http://ailab.ahu.edu.cn:8087/DeepSS/index.html.
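
The idea of reading motifs from CNN filters applied directly to DNA can be illustrated with a minimal sketch. It is not the DeepSS implementation: the function names (`one_hot`, `scan_filter`, `best_match`) and the hand-set `GT_FILTER` weights are hypothetical, chosen only to show how a one-hot encoded sequence is scanned by a single convolution filter and how the highest-scoring window can be read as a candidate motif. In a trained network the filter weights would be learned, not set by hand.

```python
# Illustrative sketch only; names and weights are assumptions, not DeepSS code.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def scan_filter(seq, weights):
    """Slide one filter (rows of 4 weights) along the sequence.

    Returns the ReLU activation for every window position.
    """
    x = one_hot(seq)
    width = len(weights)
    scores = []
    for start in range(len(x) - width + 1):
        total = sum(
            x[start + i][j] * weights[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(max(0.0, total))  # ReLU
    return scores


def best_match(seq, weights):
    """Return (position, subsequence) of the window that activates the filter most."""
    scores = scan_filter(seq, weights)
    pos = max(range(len(scores)), key=scores.__getitem__)
    return pos, seq[pos:pos + len(weights)]


# Hypothetical hand-set filter that rewards the canonical donor dinucleotide "GT".
GT_FILTER = [
    [-1.0, -1.0, 1.0, -1.0],  # first position prefers G
    [-1.0, -1.0, -1.0, 1.0],  # second position prefers T
]

if __name__ == "__main__":
    print(scan_filter("AAGTAA", GT_FILTER))
    print(best_match("AAGTAA", GT_FILTER))
```

Collecting the top-activating windows of a learned filter across many sequences and aligning them is one common way to turn a filter into a position weight matrix; whether DeepSS-M follows exactly this procedure is not stated in the abstract.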