Efficiently Mining Frequent Closed Partial Orders

J Pei, J Liu, HX Wang, K Wang, PS Yu, JY Wang
DOI: https://doi.org/10.1109/icdm.2005.57
2005-01-01
Abstract: Example 1 (Motivation). Suppose MapleBank in Canada wants to investigate whether there are orders that customers often follow when opening their accounts. A database DB (Table 1) recording four customers’ sequences of account openings at MapleBank is analyzed. Given a support threshold min_sup, a sequential pattern is a sequence that appears as a subsequence of at least min_sup sequences in DB. For example, let min_sup = 3. The following four sequences are sequential patterns, since each is a subsequence of three sequences in DB, namely sequences 1, 2 and 4: CHK → MMK → MORT → RESP; CHK → MMK → MORT → BROK; CHK → RRSP → MORT → RESP; CHK → RRSP → MORT → BROK. The sequential patterns capture the frequent account-opening patterns shared by customers. However, the four sequential patterns cannot completely capture the ordering shared by customers 1, 2 and 4. It is easy to see that a partial order R, shown in Figure 1, is shared by the three account-opening sequences. The partial order R summarizes the four sequential patterns: all four are paths in R. It also provides more information about the ordering than the sequential patterns do.
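The definitions of support and sequential pattern in Example 1 can be illustrated with a minimal Python sketch. Table 1 and Figure 1 are not reproduced here, so the database `DB` and the edge set `R_EDGES` below are hypothetical stand-ins, chosen only to be consistent with the claims in the abstract (four patterns, each supported by sequences 1, 2 and 4). The function names are likewise illustrative, and this is not the mining algorithm proposed in the paper.

```python
# Hypothetical stand-in for Table 1: customer id -> account-opening sequence.
# These sequences are assumptions, chosen to match the abstract's claims.
DB = {
    1: ["CHK", "MMK", "RRSP", "MORT", "RESP", "BROK"],
    2: ["CHK", "RRSP", "MMK", "MORT", "RESP", "BROK"],
    3: ["MMK", "CHK", "BROK", "RESP", "RRSP"],
    4: ["CHK", "MMK", "RRSP", "MORT", "BROK", "RESP"],
}

# The four sequential patterns listed in the abstract.
PATTERNS = [
    ["CHK", "MMK", "MORT", "RESP"],
    ["CHK", "MMK", "MORT", "BROK"],
    ["CHK", "RRSP", "MORT", "RESP"],
    ["CHK", "RRSP", "MORT", "BROK"],
]

# Assumed edge set for the partial order R: the union of the consecutive
# pairs of the four patterns. Figure 1 itself is not available here.
R_EDGES = {
    ("CHK", "MMK"), ("CHK", "RRSP"),
    ("MMK", "MORT"), ("RRSP", "MORT"),
    ("MORT", "RESP"), ("MORT", "BROK"),
}


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items must be found strictly after earlier ones.
    return all(item in remaining for item in pattern)


def supporters(pattern, db):
    """Ids of the sequences in db that contain pattern as a subsequence."""
    return [cid for cid, seq in db.items() if is_subsequence(pattern, seq)]


def respects_order(edges, sequence):
    """True if every edge (a, b) has a occurring before b in sequence."""
    position = {item: i for i, item in enumerate(sequence)}
    return all(
        a in position and b in position and position[a] < position[b]
        for a, b in edges
    )


min_sup = 3
for pattern in PATTERNS:
    ids = supporters(pattern, DB)
    print(" -> ".join(pattern), "supported by", ids, "frequent:", len(ids) >= min_sup)

print("sequences sharing R:", [cid for cid, seq in DB.items() if respects_order(R_EDGES, seq)])
```

Under these assumed data, each pattern is checked independently against `min_sup`, while `respects_order` tests all ordering constraints of R at once, which is the sense in which a single partial order summarizes several sequential patterns.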