A closed sequential pattern mining algorithm for discovery of the software bugs feature
Jiadong Ren, Yujie Xie, Aiguo Zhang, Changzhen Hu, Yunyun Chen
2011-01-01
Journal of Computational Information Systems
Abstract: To discover the features of software bugs, which can help improve the safety performance of software, this paper proposes a novel algorithm, MSPT (mining closed sequential patterns based on a projection tree), and an update algorithm, UMSPT. In MSPT, a positional information table of the semi-frequent and frequent items is constructed first, and the semi-frequent and frequent 2-patterns are then found from the position information of the items. Next, a BStree (bug sequences tree) is built that contains all of the semi-frequent and frequent items. In the BStree, S-step extension items are linked with solid lines and I-step extension items are linked with broken lines. Occurrences of the same item that appear first in each sequence represented in the tree are linked to a header table. A PBStree (bug sequences projection tree) is then built for each existing 2-pattern, and the extension sequences are obtained by mining the PBStree. At the same time, the inclusion relations among the existing patterns are checked to obtain the SCS (semi-closed sequential patterns) and CS (closed sequential patterns). This process is repeated recursively until no item remains whose support count is greater than μ*minsup. In UMSPT, when new bug sequences are found, the new sequences, with infrequent items removed, are inserted into the BStree, and a PBStree is built for each inserted sequence to find the ISCS (semi-closed sequential patterns in incremental sequences) and ICS (closed sequential patterns in incremental sequences) in that sequence. Finally, the complete set of SCS and CS is obtained by checking the inclusion relations among the existing ISCS, ICS, SCS and CS. The experimental results show that MSPT and UMSPT have better time efficiency. Copyright © 2011 Binary Information Press.
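The following Python sketch illustrates three ideas named in the abstract: the positional information table, the split into frequent and semi-frequent items, and the inclusion check that yields closed patterns. It is not the authors' MSPT implementation and builds neither the BStree nor the PBStree. All function names are hypothetical. It also assumes, since the abstract does not say so explicitly, that an item is frequent when its support count is at least minsup and semi-frequent when its support count lies between μ*minsup and minsup, and that a pattern is closed when no proper super-sequence has the same support.

```python
from collections import defaultdict


def position_table(sequences):
    """Map each item to the (sequence index, itemset index) pairs where it occurs."""
    table = defaultdict(list)
    for sid, seq in enumerate(sequences):
        for pos, itemset in enumerate(seq):
            for item in itemset:
                table[item].append((sid, pos))
    return table


def item_support(table):
    """Support count of an item = number of distinct sequences that contain it."""
    return {item: len({sid for sid, _ in occ}) for item, occ in table.items()}


def classify_items(support, minsup, mu):
    """Assumed thresholds: frequent if >= minsup, semi-frequent if in [mu*minsup, minsup)."""
    frequent = {i for i, s in support.items() if s >= minsup}
    semi = {i for i, s in support.items() if mu * minsup <= s < minsup}
    return frequent, semi


def is_subsequence(small, big):
    """True if every itemset of `small` is contained, in order, in distinct itemsets of `big`."""
    matched = 0
    for itemset in big:
        if matched < len(small) and small[matched] <= itemset:
            matched += 1
    return matched == len(small)


def closed_patterns(patterns):
    """Keep patterns that have no proper super-sequence with the same support."""
    closed = {}
    for p, sup in patterns.items():
        absorbed = any(
            q != p and qsup == sup and is_subsequence(p, q)
            for q, qsup in patterns.items()
        )
        if not absorbed:
            closed[p] = sup
    return closed


# Toy bug sequences: each sequence is a list of itemsets (illustrative data only).
sequences = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"b", "c"}],
    [{"a"}, {"c"}],
    [{"d"}],
]
support = item_support(position_table(sequences))
frequent, semi = classify_items(support, minsup=3, mu=0.5)

# Patterns with hand-counted supports over the toy data above.
fs = frozenset
patterns = {
    (fs("a"),): 3,
    (fs("c"),): 3,
    (fs("b"),): 2,
    (fs("a"), fs("c")): 3,
    (fs("a"), fs("b")): 2,
}
closed = closed_patterns(patterns)
```

In this toy data, `a` and `c` are frequent, `b` is semi-frequent, and `d` is dropped as infrequent. The single-item patterns are absorbed by `<a, c>` and `<a, b>`, which have the same supports, so only those two remain closed.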