InterProScan 5: genome-scale protein function classification

Philip Jones, David Binns, Hsin-Yu Chang, Matthew Fraser, Weizhong Li, Craig McAnulla, Hamish McWilliam, John Maslen, Alex Mitchell, Gift Nuka, Sebastien Pesseat, Antony F. Quinn, Amaia Sangrador-Vegas, Maxim Scheremetjew, Siew-Yit Yong, Rodrigo Lopez, Sarah Hunter
DOI: https://doi.org/10.1093/bioinformatics/btu031
IF: 5.8
2014-01-21
Bioinformatics
Abstract: MOTIVATION: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists frequently need to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the software's outputs and a complete reimplementation of the software framework, resulting in a flexible and stable system that can use multiprocessor machines, conventional clusters, or both to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBL-EBI FTP site, and the open source code is hosted at Google Code.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?