A Limited Lattice Structure for Incremental Association Mining.
Yi Zhao,Jianliang Shi,Pengfei Shi
DOI: https://doi.org/10.1007/3-540-44533-1_89
2000-01-01
Abstract: Association rule mining seeks dependencies among attributes, subject to externally defined parameters such as a support threshold and a confidence threshold. The core of association rule mining, an important database discovery method, is the acquisition of large itemsets. Representing the support and confidence of items that are purchased together is an important data mining problem in the supermarket domain. In this paper, a novel limited concept lattice is first proposed for modeling the itemsets of transaction data. A concept lattice is a form of concept hierarchy in which each node represents a subset of objects (the extent) together with their common properties (the intent). The Hasse diagram of the lattice represents the generalization/specialization relationship between the concepts. The lattice and the Hasse diagram corresponding to a set of objects described by some properties can therefore be used as an effective tool for symbolic data analysis and knowledge acquisition. Based on this lattice structure, an algorithm, LCLL, is presented to incrementally generate large itemsets visually. The algorithm attaches frequency information to each lattice node, so that the corresponding support measure can be obtained from the limited lattice. In addition, the edges in the Hasse diagram of the new lattice must be modified: the generator of a new node is always its child, and the original parent of the generator is updated. When deletions reduce the frequency value of a node to zero, the node and the edges between its parents and children are not deleted but tagged. The key step is adding edges while searching for the new node's parents. The large itemsets can then be obtained by checking whether the cardinality and the frequency value of a node exceed the thresholds, and association rules can be identified accordingly.
The approach is especially efficient when the database is dynamically updated (insertion, deletion, or simultaneous insertion and deletion). Compared with K. Hu's approach [5], our algorithm generates all the association rules with much lower time complexity. The time complexity of the proposed algorithm is inversely proportional to the cardinality of the transactions, which makes the approach applicable to the supermarket domain.
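
The idea of attaching frequency information to lattice nodes can be illustrated with a minimal Python sketch. It is not the LCLL algorithm: it builds the concept intents in batch, by brute-force intersection of transactions, and keeps no Hasse-diagram edges. The function names (`concept_frequencies`, `large_nodes`, `support_count`, `confidence`), the `min_count` and `min_size` parameters, and the sample transactions are all assumptions made for illustration.

```python
def concept_frequencies(transactions):
    """Map each non-empty concept intent to its frequency (extent size).

    Intents are every intersection of one or more transactions. This is a
    batch, brute-force construction, not the incremental LCLL procedure.
    """
    rows = [frozenset(t) for t in transactions]
    intents = set()
    for row in rows:
        # Intersect the row with every intent found so far, then add the row.
        intents |= {row & known for known in intents} | {row}
    intents.discard(frozenset())
    # Frequency of a node = number of transactions that contain its intent.
    return {i: sum(1 for r in rows if i <= r) for i in intents}


def large_nodes(freq, min_count, min_size=1):
    """Keep nodes whose frequency and cardinality both reach the thresholds."""
    return {i: f for i, f in freq.items()
            if f >= min_count and len(i) >= min_size}


def support_count(freq, itemset):
    """Support count of any itemset: the highest frequency among the
    nodes whose intent contains it (0 if no node contains it)."""
    itemset = frozenset(itemset)
    return max((f for i, f in freq.items() if itemset <= i), default=0)


def confidence(freq, lhs, rhs):
    """Confidence of the rule lhs -> rhs, or 0.0 if lhs never occurs."""
    base = support_count(freq, lhs)
    if base == 0:
        return 0.0
    return support_count(freq, set(lhs) | set(rhs)) / base


if __name__ == "__main__":
    # Hypothetical supermarket baskets.
    baskets = [
        {"bread", "milk"},
        {"bread", "milk", "butter"},
        {"bread", "butter"},
        {"milk"},
    ]
    freq = concept_frequencies(baskets)
    for intent, count in sorted(large_nodes(freq, min_count=2).items(),
                                key=lambda kv: sorted(kv[0])):
        print(sorted(intent), count)
    print(confidence(freq, {"butter"}, {"bread"}))
```

Only the closed itemsets that appear as lattice nodes are stored here. The support of any other itemset is read from the nodes that contain it, which is why a frequency-annotated lattice is enough to evaluate both support and confidence.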