SDFP-Growth Algorithm as a Novelty of Association Rule Mining Optimization

Boby Siswanto, Haryono Soeparno, Nesti Fronika Sianipar, Widodo Budiharto
DOI: https://doi.org/10.1109/access.2024.3361667
IF: 3.9
2024-02-16
IEEE Access
Abstract: An essential element of association rules is the strong confidence values that depend on the support value threshold, which determines the optimum number of datasets. The existing method for determining the support value threshold is manual trial and error: the user picks a support value such as 10%, 30%, or 60% by instinct. An inappropriate support value threshold produces useless frequent patterns, overburdens computer resources, and wastes time. The formula for predicting the maximum count of frequent patterns is $2^n - 1$, where $n$ is the number of distinct items in the dataset. This paper proposes a new SDFP-growth algorithm that does not require manual determination of the support threshold value. The SDFP-growth algorithm performs dimensionality reduction on the original dataset, generating smaller level 1 and level 2 datasets, and thus automatically produces a dataset with an optimum amount of data at a minimum support value threshold. The proposed formula for predicting the maximum number of frequent patterns becomes $2^{n'} - 1$, where $n'$ is always smaller than $n$. Experiments on five different datasets reduced the number of data dimensions by more than 3% on the level 1 dataset and more than 69% on the level 2 dataset while maintaining the confidence value of the strong rules. Execution time improved by more than 2% on the level 1 dataset and more than 94% on the level 2 dataset.
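To make the pattern-count bound concrete, the following minimal Python sketch computes 2^n - 1 for n distinct items and shows how the bound shrinks when the number of distinct items is reduced. It is an illustration under stated assumptions, not the SDFP-growth algorithm: the toy transactions, the helper names, and the `min_count` filter used as a stand-in for dimensionality reduction are all hypothetical and do not come from the paper.

```python
from collections import Counter


def max_frequent_patterns(n_items: int) -> int:
    """Upper bound on non-empty itemsets over n_items distinct items: 2^n - 1."""
    return 2 ** n_items - 1


def distinct_items(transactions):
    """Set of all distinct items appearing in any transaction."""
    return {item for t in transactions for item in t}


def drop_rare_items(transactions, min_count):
    """Stand-in reduction (an assumption, not the paper's method):
    remove items that occur in fewer than min_count transactions."""
    counts = Counter(item for t in transactions for item in set(t))
    kept = {item for item, c in counts.items() if c >= min_count}
    reduced = [[item for item in t if item in kept] for t in transactions]
    return [t for t in reduced if t]  # discard transactions left empty


# Hypothetical toy dataset with 6 distinct items.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "jam"],
    ["milk", "eggs"],
    ["bread", "milk", "tea"],
]

n = len(distinct_items(transactions))
reduced = drop_rare_items(transactions, min_count=2)
n_reduced = len(distinct_items(reduced))

bound_before = max_frequent_patterns(n)
bound_after = max_frequent_patterns(n_reduced)
print(f"n = {n}, bound = {bound_before}")
print(f"reduced n = {n_reduced}, bound = {bound_after}")
```

Because the bound is exponential in the number of distinct items, removing even a few items cuts the worst-case pattern count sharply; here it falls from 63 to 3.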
computer science, information systems; telecommunications; engineering, electrical & electronic