Protein Sequence Classification Using Extreme Learning Machine

Dianhui Wang, Guang-Bin Huang
2005-01-01
Abstract: Traditionally, two protein sequences are classified into the same class if they have high homology in terms of feature patterns extracted through sequence alignment algorithms. These algorithms compare an unseen protein sequence with all the identified protein sequences and return the higher-scored protein sequences. As the sizes of the protein sequence databases are very large, it is very time-consuming to perform an exhaustive comparison against existing protein sequences. Therefore, there is a need to build an improved classification system for effectively identifying protein sequences. In this paper, a recently developed machine learning algorithm referred to as the Extreme Learning Machine (ELM) is used to classify protein sequences with ten classes of super-families downloaded from a public domain database. A comparative study on system performance is conducted between ELM and the main conventional neural network classifier, the Backpropagation (BP) Neural Network. Results show that ELM needs up to four orders of magnitude less training time compared to the BP network. The classification accuracy of ELM is also higher than that of the BP network. For a given network architecture, ELM does not have any control parameters (e.g., stopping criteria, learning rate, learning epochs) to be manually tuned and can be implemented easily.
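The abstract does not spell out the ELM procedure. As ELM is commonly described, the hidden-layer weights of a single-hidden-layer network are drawn at random and left fixed, and only the output weights are computed, in one step, by a least-squares solve. The following is a minimal sketch under that assumption, not the authors' implementation: the sigmoid activation, the 20 hidden nodes, the function names `elm_train` and `elm_predict`, and the synthetic two-class data (standing in for protein feature vectors) are all illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, rng):
    """Fit an ELM: random hidden layer, least-squares output weights."""
    # Input weights and biases are sampled once and never updated.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = sigmoid(X @ W + b)            # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T      # Moore-Penrose least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Return the index of the largest output node for each sample."""
    return np.argmax(sigmoid(X @ W + b) @ beta, axis=1)

# Synthetic stand-in data (NOT protein features): two separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
T = np.eye(2)[y]                      # one-hot class targets

W, b, beta = elm_train(X, T, n_hidden=20, rng=rng)
acc = float(np.mean(elm_predict(X, W, b, beta) == y))
print(f"training accuracy: {acc:.2f}")
```

Training here involves no iterative loop, learning rate, or stopping criterion; the number of hidden nodes is part of the network architecture rather than a training control parameter, which is consistent with the abstract's claim about manual tuning.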