Gene clustering with hidden Markov model optimized by PSO algorithm

Mohammad Soruri, Javad Sadri, S. Hamid Zahiri
DOI: https://doi.org/10.1007/s10044-018-0680-9
IF: 2.307
2018-03-01
Pattern Analysis and Applications
Abstract: Gene clustering is one of the most important problems in bioinformatics. In sequential data clustering, hidden Markov models (HMMs) have been widely used to measure similarity between sequences because they can handle sequence patterns of varying lengths. In this paper, a novel gene clustering scheme based on HMMs optimized by the particle swarm optimization (PSO) algorithm is introduced. In this approach, each gene sequence is described by its own HMM, and for each model the probability that it generates each individual sequence is evaluated. A hierarchical clustering algorithm based on a new definition of a distance measure is then applied to find the best clusters. Experiments carried out on a dataset of lung cancer-related genes show that the proposed approach can be successfully used for gene clustering.
computer science, artificial intelligence
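
The pipeline outlined in the abstract (one model per sequence, cross-likelihoods, a distance measure, hierarchical clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption, not the paper's method: `frequency_model` is a hypothetical single-state stand-in for a PSO-trained HMM, the symmetric distance is a commonly used likelihood-based choice rather than the paper's new definition, and average linkage is chosen only for illustration.

```python
import math

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][seq[0]] for s in range(n)]
    loglik = 0.0
    for sym in seq[1:] + "\0":  # sentinel triggers the final normalisation
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
        if sym == "\0":
            break
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][sym]
                 for s in range(n)]
    return loglik

def frequency_model(seq, alphabet="ACGT"):
    """Hypothetical stand-in for training: a one-state HMM whose emissions
    are add-one-smoothed symbol frequencies (the paper trains HMMs with PSO)."""
    counts = {c: 1 + seq.count(c) for c in alphabet}
    total = sum(counts.values())
    return [1.0], [[1.0]], [{c: counts[c] / total for c in alphabet}]

def distance_matrix(seqs, models):
    """Assumed symmetric distance built from per-symbol cross log-likelihoods."""
    n = len(seqs)
    # L[i][j]: per-symbol log-likelihood of sequence j under model i
    L = [[log_likelihood(seqs[j], *models[i]) / len(seqs[j]) for j in range(n)]
         for i in range(n)]
    return [[abs(L[i][i] + L[j][j] - L[i][j] - L[j][i]) / 2 for j in range(n)]
            for i in range(n)]

def agglomerate(dist, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters

# Toy sequences (made up for illustration, not the paper's dataset)
seqs = ["AAAATAAATA", "ATAAAATAAA", "GCGGCCGCGG", "CGGCGCGGCC"]
models = [frequency_model(s) for s in seqs]
clusters = agglomerate(distance_matrix(seqs, models), k=2)
print(clusters)
```

Dividing each log-likelihood by the sequence length keeps sequences of different lengths comparable, which is the property the abstract cites as the reason for using HMMs.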