Mining gene expression data using a novel approach based on hidden Markov models.

Xinglai Ji, Jesse Li-Ling, Zhirong Sun
DOI: https://doi.org/10.1016/S0014-5793(03)00363-6
2003-01-01
FEBS Letters
Abstract: In this work we have developed a new framework for microarray gene expression data analysis, based on hidden Markov models. We benchmarked this probability model-based clustering algorithm on several gene expression datasets for which external evaluation criteria were available. The results showed that the approach produces clusters of quality comparable to those of two prevalent clustering algorithms, with the major advantage that it also determines the number of clusters. We also applied the algorithm to published yeast cell cycle gene expression data and found that it successfully identified biologically meaningful gene groups. In addition, the algorithm can find correlations between different functional groups and distinguish functional genes from regulatory genes, which helps in constructing a network describing particular biological associations. Currently, the method is limited to time series data. Supplementary materials are available at http://www.bioinfo.tsinghua.edu.cn/~rich/hmmgep_supp/.
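The abstract does not describe the algorithm's internals, so the following is only a minimal sketch of the general idea behind HMM-based clustering of time series, not the authors' implementation. It assumes each expression profile is discretized into down/flat/up symbols, scores the symbols under each candidate HMM with the scaled forward algorithm, and assigns the profile to the best-scoring model. The function names, the threshold, and the two hand-picked models in `MODELS` are all hypothetical; a real method would fit model parameters to data, and this sketch does not attempt to determine the number of clusters.

```python
import math

def discretize(series, threshold=0.1):
    """Map consecutive differences to symbols: 0 = down, 1 = flat, 2 = up."""
    symbols = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        if diff > threshold:
            symbols.append(2)
        elif diff < -threshold:
            symbols.append(0)
        else:
            symbols.append(1)
    return symbols

def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: returns log P(symbols | HMM).

    Assumes the sequence has nonzero probability under the model.
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][symbols[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * emit[s][symbols[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)          # P(symbol t | earlier symbols)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob

# Hypothetical, hand-picked two-state models (start, transitions, emissions).
# Emission columns are ordered [down, flat, up].
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "rise_then_fall": ([1.0, 0.0], LEFT_TO_RIGHT,
                       [[0.1, 0.2, 0.7], [0.7, 0.2, 0.1]]),
    "fall_then_rise": ([1.0, 0.0], LEFT_TO_RIGHT,
                       [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]),
}

def assign_cluster(series, models):
    """Return the name of the model giving the series the highest likelihood."""
    symbols = discretize(series)
    return max(models, key=lambda name: log_likelihood(symbols, *models[name]))

print(assign_cluster([0.0, 1.0, 2.0, 1.0, 0.0], MODELS))
```

In this toy setup each model plays the role of one cluster; a profile that rises and then falls scores higher under the model whose first state favors "up" symbols and whose second state favors "down" symbols.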