Iacet-Sumo: Identification of Lysine Acetylation and Sumoylation Sites in Proteins by Multi-Class Transformation Methods.

Yingxi Yang, Hui Wang, Jun Ding, Yan Xu
DOI: https://doi.org/10.1016/j.compbiomed.2018.07.006
IF: 7.7
2018-01-01
Computers in Biology and Medicine
Abstract:
MOTIVATION: Posttranslational modification (PTM) is a biological mechanism involved in the enzymatic modification of proteins after translation by ribosomes. Two or more modifications occurring at one residue can be treated as a multi-label system. Two or more simultaneous modifications on a residue are more common than single PTMs. Lysine residues in proteins can be subjected to a variety of PTMs, such as ubiquitination, acetylation, sumoylation, methylation, and succinylation. Identifying modification sites in uncharacterized protein sequences is a highly significant and current problem. In particular, to provide a method for processing multi-label sequences of lysine residues, it is highly desirable to develop computational methods to predict lysine acetylation and sumoylation modifications.
RESULTS: In this paper, we introduce an integrated approach, known as the five-step prediction method (FSPM), to address the problem by (1) using one-sided selection (OSS) to deal with imbalanced data, (2) extracting binary features from protein sequences, (3) incorporating binary relevance, classifier chains, and multi-class transformation methods to simplify multi-label problems, (4) constructing different classifiers, and (5) implementing cross-validation and evaluating these classifiers. In 10-fold cross-validation, FSPM achieved an accuracy of 61.49% and an absolute-true rate of 60.17%. The results showed that FSPM is accurate and could be used as a powerful engine in multi-label systems. We also conducted a variety of statistical analyses of the predicted results to discuss the biological functions of lysine acetylation and sumoylation.
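To make steps (2), (3), and (5) more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "binary features" means one-hot encoding of each residue in a sequence window around the lysine, that the multi-class transformation maps each (acetylation, sumoylation) label pair to a single class id, and that accuracy and absolute-true rate follow the commonly used multi-label definitions (mean overlap of label sets, and fraction of exact matches). All function names, the alphabet ordering, and the handling of empty label sets are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed alphabet: the 20 standard amino acids, in an arbitrary fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

Labels = Tuple[int, int]  # (acetylation, sumoylation), each 0 or 1


def binary_encode(window: str) -> List[int]:
    """One-hot encode a peptide window, 20 bits per residue.

    Characters outside the alphabet (e.g. padding) become all zeros.
    """
    features: List[int] = []
    for residue in window:
        vec = [0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:
            vec[idx] = 1
        features.extend(vec)
    return features


def to_multiclass(labels: Labels) -> int:
    """Map a label pair to one of four classes (hypothetical encoding)."""
    acet, sumo = labels
    return 2 * acet + sumo


def from_multiclass(cls: int) -> Labels:
    """Invert to_multiclass."""
    return (cls // 2, cls % 2)


def absolute_true(y_true: Sequence[Labels], y_pred: Sequence[Labels]) -> float:
    """Fraction of samples whose predicted label set matches exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return exact / len(y_true)


def multilabel_accuracy(y_true: Sequence[Labels], y_pred: Sequence[Labels]) -> float:
    """Mean of |intersection| / |union| over samples.

    Assumption: a sample with no true and no predicted labels scores 1.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(t, p) if a == 1 or b == 1)
        total += inter / union if union else 1.0
    return total / len(y_true)
```

After the transformation, any ordinary multi-class classifier can be trained on the class ids, and its predictions mapped back to label pairs with `from_multiclass` before scoring. Absolute-true is the stricter of the two metrics, since a partially correct prediction contributes nothing to it.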