Approximate Frequent Itemset Mining for Streaming Data on FPGA

Yubin Li, Yuliang Sun, Guohao Dai, Qiang Xu, Yu Wang, Huazhong Yang
DOI: https://doi.org/10.1109/fpl.2016.7577331
2016-01-01
Abstract: Frequent Itemset Mining (FIM) aims to find itemsets that occur frequently in a series of transactions. It is extremely expensive in both memory and time. Frequent Itemset Mining from a Data Stream (FIM-DS) is even more challenging, since storing an unbounded stream in memory is infeasible. In recent years, researchers have proposed various approximation algorithms for FIM-DS. However, their computational complexity is still high, and these methods are difficult to accelerate in hardware. In this paper, we propose a Space-Saving based approximate algorithm for FIM-DS. It avoids exponential candidate generation and comparison. We design a hardware accelerator and implement it on an FPGA platform. Experimental results show that the software implementation of our algorithm achieves up to 8.4× speedup for transactions with a small item database, and that our hardware accelerator achieves up to 50,000× speedup for transactions with a small number of items and 5.3× speedup for transactions with an extremely large number of items.
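For orientation, the classic Space-Saving counter scheme that the proposed algorithm builds on can be sketched as follows. This is a minimal illustration of the well-known single-item version only, not the paper's itemset extension or its FPGA design; the function name `space_saving` and the parameter `k` (the counter budget) are hypothetical names chosen for this sketch.

```python
def space_saving(stream, k):
    """Classic Space-Saving over single items, using at most k counters.

    Illustrative sketch only: the paper's algorithm for itemsets is not
    reproduced here.
    """
    counts = {}  # item -> estimated count (never below the true count)
    errors = {}  # item -> maximum possible overestimation of that count
    for item in stream:
        if item in counts:
            # Already monitored: just increment.
            counts[item] += 1
        elif len(counts) < k:
            # Free counter available: start monitoring exactly.
            counts[item] = 1
            errors[item] = 0
        else:
            # No free counter: replace the item with the smallest count
            # and inherit that count as the newcomer's error bound.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[item] = floor + 1
            errors[item] = floor
    return counts, errors
```

Memory stays bounded by `k` regardless of stream length, and for every monitored item the true frequency lies between `counts[item] - errors[item]` and `counts[item]`.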