S2L-PSIBLAST: a Supervised Two-Layer Search Framework Based on PSI-BLAST for Protein Remote Homology Detection.

Xiaopeng Jin, Qing Liao, Bin Liu
DOI: https://doi.org/10.1093/bioinformatics/btab472
IF: 5.8
2021-01-01
Bioinformatics
Abstract:
MOTIVATION: Protein remote homology detection is a challenging task in the study of protein evolutionary relationships. PSI-BLAST is an important and fundamental search method for detecting homologous proteins. Although many improved versions of PSI-BLAST have been proposed, their performance is still limited by the search process of PSI-BLAST.
RESULTS: To further improve the performance of PSI-BLAST for protein remote homology detection, a supervised two-layer search framework based on PSI-BLAST (S2L-PSIBLAST) is proposed. S2L-PSIBLAST consists of a two-level search. The first-level search provides high-quality search results by using the SMI-BLAST framework and a double-link strategy to filter out non-homologous protein sequences. The second-level search detects more homologous proteins by profile-link similarity, and a learning-to-rank strategy then produces more accurate ranking lists for the detected protein sequences. Experimental results on the updated version of the Structural Classification of Proteins-extended benchmark dataset show that S2L-PSIBLAST not only clearly improves the performance of PSI-BLAST, but also achieves better performance on two improved versions of PSI-BLAST: DELTA-BLAST and PSI-BLASTexB.
AVAILABILITY AND IMPLEMENTATION: http://bliulab.net/S2L-PSIBLAST.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
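The two-level search described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the abstract does not define the double-link strategy or profile-link similarity, so the sketch assumes a simple reciprocal-hit check for the first level, expansion through trusted hits for the second level, and a fixed scoring rule standing in for the learned ranking model. All names (`TOY_HITS`, `first_layer`, `second_layer`, `rank`, `two_layer_search`) and all scores are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for precomputed search output: hits[a][b] is the
# score of target b when a is used as the query. Ids and scores are made up.
Hits = Dict[str, Dict[str, float]]

TOY_HITS: Hits = {
    "q": {"a": 0.9, "x": 0.8},
    "a": {"q": 0.85, "b": 0.7},
    "x": {"q": 0.1},
    "b": {"a": 0.6},
}


def first_layer(query: str, hits: Hits, threshold: float) -> Set[str]:
    """Assumed filter: keep targets that score well in both directions."""
    kept = set()
    for target, score in hits.get(query, {}).items():
        back = hits.get(target, {}).get(query, 0.0)
        if score >= threshold and back >= threshold:
            kept.add(target)
    return kept


def second_layer(query: str, trusted: Set[str], hits: Hits,
                 threshold: float) -> Set[str]:
    """Assumed expansion: add sequences reachable through a trusted hit."""
    found = set(trusted)
    for mid in trusted:
        for target, score in hits.get(mid, {}).items():
            if target != query and score >= threshold:
                found.add(target)
    return found


def rank(query: str, candidates: Set[str], trusted: Set[str],
         hits: Hits) -> List[str]:
    """Fixed rule standing in for a learned ranker: best direct or indirect score."""
    def score(target: str) -> float:
        direct = hits.get(query, {}).get(target, 0.0)
        indirect = max(
            (hits.get(mid, {}).get(target, 0.0) for mid in trusted),
            default=0.0,
        )
        return max(direct, indirect)

    # Sort by descending score, breaking ties by id for a stable order.
    return sorted(candidates, key=lambda t: (-score(t), t))


def two_layer_search(query: str, hits: Hits, threshold: float = 0.5) -> List[str]:
    """Run the filter, the expansion, and the ranking in sequence."""
    trusted = first_layer(query, hits, threshold)
    candidates = second_layer(query, trusted, hits, threshold)
    return rank(query, candidates, trusted, hits)
```

In this toy data, `x` is dropped at the first level because the reverse score is low, while `b` is only reachable at the second level through the trusted hit `a`. A real pipeline would replace the lookup table with actual search runs and the fixed rule with a trained ranking model.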