Mining Order-Preserving Submatrices Based On Frequent Sequential Pattern Mining

Yun Xue, Yuting Li, Weijun Deng, Jiejin Li, Jianxiong Tang, Zhengling Liao, Tiechen Li
DOI: https://doi.org/10.1007/978-3-319-06269-3_20
2014-01-01
Abstract: Order-Preserving Submatrices (OPSMs) are a widely accepted pattern-based biclustering model used in gene expression data analysis. The OPSM problem aims to find groups of genes that exhibit similar rises and falls under certain conditions. However, most existing methods are heuristic and cannot reveal all OPSMs. In this paper, we propose an exact method that discovers all OPSMs based on frequent sequential pattern mining. First, an algorithm is adapted to find all common subsequences (ACS) between every pair of sequences. Then an improved prefix-tree data structure is used to store and traverse all common subsequences, and the Apriori principle is employed to mine frequent sequential patterns efficiently. Finally, experiments are conducted on a real data set, and GO analysis is applied to determine whether the discovered patterns are biologically significant. The results demonstrate the effectiveness and efficiency of the method.
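To make the link between OPSMs and frequent sequential patterns concrete, the following is a minimal sketch, not the authors' implementation. It assumes each gene (row) has distinct expression values, converts each row into the sequence of column indices sorted by increasing value, and then mines column sequences shared by at least `min_support` rows with a plain level-wise Apriori search. The ACS step and the improved prefix tree described in the abstract are not reproduced here; the function names `row_to_sequence`, `supports`, and `mine_opsms` are hypothetical.

```python
def row_to_sequence(row):
    """Return the column indices of one row ordered by increasing value.

    Assumption: values within a row are distinct (ties are not handled).
    """
    return tuple(sorted(range(len(row)), key=lambda j: row[j]))


def supports(seq, pattern):
    """True if `pattern` is a (not necessarily contiguous) subsequence of `seq`."""
    it = iter(seq)
    # `col in it` advances the iterator, so the order of columns is enforced.
    return all(col in it for col in pattern)


def mine_opsms(matrix, min_support, min_length=2):
    """Map each frequent column sequence to the list of rows that support it.

    A row supports a column sequence if its values strictly increase along
    that sequence, i.e. the rows and columns together form an OPSM.
    """
    sequences = [row_to_sequence(r) for r in matrix]
    n_cols = len(matrix[0])

    # Level 1: a single column is trivially supported by every row.
    frequent = {(j,): list(range(len(matrix))) for j in range(n_cols)}
    results = {}

    while frequent:
        next_level = {}
        for pattern, rows in frequent.items():
            for j in range(n_cols):
                if j in pattern:
                    continue
                candidate = pattern + (j,)
                # Apriori pruning: every subsequence of a frequent pattern is
                # frequent, so the suffix of the candidate must be frequent.
                if candidate[1:] not in frequent:
                    continue
                # Only rows supporting the prefix can support the candidate.
                supporting = [r for r in rows if supports(sequences[r], candidate)]
                if len(supporting) >= min_support:
                    next_level[candidate] = supporting
        for pattern, rows in next_level.items():
            if len(pattern) >= min_length:
                results[pattern] = rows
        frequent = next_level

    return results
```

Calling `mine_opsms(matrix, min_support=2)` on a small list-of-lists matrix returns a dictionary keyed by column sequences. This search is exact but enumerates candidates one level at a time; the method in the paper stores the common subsequences in a prefix tree instead, which this sketch does not attempt to model.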