Computational detection of allergenic proteins attains a new level of accuracy with in silico variable-length peptide extraction and machine learning

D. Soeria-Atmadja, T. Lundell, M. G. Gustafsson, U. Hammerling
DOI: https://doi.org/10.1093/nar/gkl467
IF: 14.9
2006-07-28
Nucleic Acids Research
Abstract: The placing of novel or new-in-the-context proteins on the market, appearing in genetically modified foods, certain bio-pharmaceuticals and some household products, leads to human exposure to proteins that may elicit allergic responses. Accurate methods to detect allergens are therefore necessary to ensure consumer and patient safety. We demonstrate that it is possible to reach a new level of accuracy in computational detection of allergenic proteins by presenting a novel detector, Detection based on Filtered Length-adjusted Allergen Peptides (DFLAP). The DFLAP algorithm extracts variable-length allergen sequence fragments and employs modern machine learning techniques in the form of a support vector machine. In particular, this new detector shows hitherto unmatched specificity when challenged against the Swiss-Prot repository, without appreciable loss of sensitivity. DFLAP is also the first reported detector that successfully discriminates between allergens and non-allergens occurring in protein families known to hold both categories. Allergenicity assessment for specific protein sequences of interest using DFLAP is possible via ulfh@slv.se.
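The general idea of extracting variable-length fragments from allergen sequences and filtering them can be illustrated with a minimal sketch. This is not the DFLAP implementation: the abstract does not state the fragment length range, the filtering criterion, or the feature encoding, so the length bounds, the rule of discarding fragments that also occur in non-allergens, the binary presence features, and all function names below are assumptions made purely for illustration.

```python
from typing import Iterable, List, Set


def extract_fragments(sequence: str, min_len: int = 6, max_len: int = 8) -> Set[str]:
    """Return every contiguous fragment whose length lies in [min_len, max_len].

    The length bounds are hypothetical; the abstract does not specify them.
    """
    fragments: Set[str] = set()
    for length in range(min_len, max_len + 1):
        # The range is empty when the sequence is shorter than `length`.
        for start in range(len(sequence) - length + 1):
            fragments.add(sequence[start:start + length])
    return fragments


def filter_fragments(allergen_fragments: Iterable[str],
                     non_allergens: Iterable[str]) -> Set[str]:
    """Keep only fragments that occur in none of the non-allergen sequences.

    This exact-match filter is an assumed stand-in for the "filtered" step.
    """
    background = list(non_allergens)
    return {f for f in allergen_fragments
            if not any(f in seq for seq in background)}


def fragment_features(query: str, fragments: List[str]) -> List[int]:
    """Encode a query protein as a binary vector of fragment occurrences."""
    return [1 if f in query else 0 for f in fragments]
```

A feature vector of this kind could then be passed to a support vector machine classifier (for example scikit-learn's `SVC`), which is the learning method the abstract names; how DFLAP actually encodes its inputs is not described here.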
biochemistry & molecular biology