A Unified Deep Biological Sequence Representation Learning with Pretrained Encoder Decoder Model

Hai-Cheng Yi, Zhu-Hong You, Xiao-Rui Su, De-Shuang Huang, Zhen-Hao Guo
DOI: https://doi.org/10.1007/978-3-030-60802-6_30
2020-01-01
Abstract: Machine learning methods are increasingly applied to model and predict biomolecular interactions, and efficient feature representation plays a vital role in their performance. To this end, BioSeq2vec, a unified deep representation learning framework for biological sequences, is proposed to extract discriminative features from any type of biological sequence. Given a sequence of arbitrary length, BioSeq2vec produces a fixed-length feature representation that can be fed to various learning models. Its performance is evaluated on lncRNA-protein interaction prediction tasks. Experimental results show superior performance in biological sequence feature representation and suggest broad prospects for genome informatics and computational biology studies.
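The central idea in the abstract, mapping a sequence of arbitrary length to a fixed-length vector, can be illustrated with a small sketch. This is not the BioSeq2vec implementation: the overlapping k-mer tokenization, the class name `TinyEncoder`, the hidden size, and the random untrained weights are all assumptions made for illustration. In an encoder-decoder setup the weights would instead be learned, typically by training a decoder to reconstruct the input from the encoder state; that training step is omitted here.

```python
import math
import random


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers (an assumed tokenization)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class TinyEncoder:
    """Toy recurrent encoder with random, untrained weights (hypothetical)."""

    def __init__(self, hidden_size=8, seed=0):
        self.hidden_size = hidden_size
        self.rng = random.Random(seed)
        self.embeddings = {}  # token -> vector, created lazily
        self.w_h = [
            [self.rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
            for _ in range(hidden_size)
        ]

    def _embed(self, token):
        # Assign each unseen token a random vector of size hidden_size.
        if token not in self.embeddings:
            self.embeddings[token] = [
                self.rng.uniform(-0.5, 0.5) for _ in range(self.hidden_size)
            ]
        return self.embeddings[token]

    def encode(self, seq, k=3):
        """Return the final hidden state; its length is independent of len(seq)."""
        n = self.hidden_size
        h = [0.0] * n
        for token in kmers(seq, k):
            x = self._embed(token)
            # Simple recurrent update: h = tanh(x + W_h h)
            h = [
                math.tanh(x[i] + sum(self.w_h[i][j] * h[j] for j in range(n)))
                for i in range(n)
            ]
        return h


encoder = TinyEncoder(hidden_size=8)
rna_vec = encoder.encode("AUGGCUACGUAGC")
protein_vec = encoder.encode("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
```

Both `rna_vec` and `protein_vec` have 8 entries even though the input sequences differ in length and alphabet, which is the property that lets one representation feed different downstream learning models.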