Machine Learning Platform for Protein Function Sites Prediction

Xiao Sun
2010-01-01
Abstract: Research on protein function is fundamental to understanding life, and machine learning is widely used in this field. This paper constructs a general platform that uses a support vector machine (SVM) to predict protein function sites. First, the platform extracts non-homologous protein sequences and encodes their features, which include basic information, physicochemical properties, structural information, and sequence conservation. It then trains an SVM on the encoded dataset and reports sensitivity, specificity, the Matthews correlation coefficient, accuracy, and the ROC curve. Finally, the best model is selected and used to predict function sites in uncharacterized proteins. The platform can also be used to analyze diseases and their related SNPs and to predict protein domains, biomolecular interactions, and so on.
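The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = function site, 0 = non-site) and uses the standard textbook definitions of sensitivity, specificity, accuracy, and the Matthews correlation coefficient. The function names `confusion_counts` and `site_metrics` are hypothetical and are not taken from the paper's platform.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = site, 0 = non-site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def site_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity, accuracy, and MCC as a dict.

    Undefined ratios (zero denominator) are reported as 0.0; this is a
    convention chosen for the sketch, not something the paper specifies.
    """
    total = tp + tn + fp + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(site_metrics(*confusion_counts(y_true, y_pred)))
```

Because function sites are usually far rarer than non-sites, accuracy alone can look high for a model that predicts almost nothing as a site; sensitivity and specificity separate the two error types, and the Matthews correlation coefficient summarizes both in one value.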