Hap: A Spatial-von Neumann Heterogeneous Automata Processor with Optimized Resource and IO Overhead on FPGA
Xuan Wang, Lei Gong, Jing Cao, Wenqi Lou, Weiya Wang, Chao Wang, Xuehai Zhou
DOI: https://doi.org/10.1145/3543622.3573190
2023-01-01
Abstract: Regular expression (REGEX) matching drives much of the research on automata processors (APs). Among them, the von Neumann AP can efficiently use on-chip memory to process Deterministic Finite Automata (DFAs), but it is limited to small REGEX sets because of the DFA state explosion problem. For large REGEX sets, the spatial AP based on Nondeterministic Finite Automata (NFAs) is the mainstream choice. However, previous FPGA-based spatial APs have two problems. First, they cannot balance the usage of FPGA resources (LUT and BRAM), which easily leads to resource shortage. Second, to compress the report output data of large REGEX sets, they use dynamic report compression, which not only consumes substantial FPGA resources but also limits performance. This paper optimizes the resource and IO overhead of the spatial AP. First, noting the resource optimization ability of the von Neumann AP, we propose the flex-hybrid-FA algorithm to generate small hybrid-FAs (an NFA/DFA hybrid model), and further propose the Spatial-von Neumann Heterogeneous AP to deploy them. Under the constraints of the flex-hybrid-FA algorithm, we obtain balanced and efficient FPGA resource usage. Second, we propose High-Efficient Automata Report Compression (HEARC), which achieves a compression ratio of 5.5-47.6x, frees performance from IO congestion, and consumes fewer FPGA resources than previous dynamic report compression approaches. To the best of our knowledge, this is the first work to deploy large REGEX sets on low-cost, small-scale FPGAs (e.g., Xilinx XCZU3CG). Experimental results show that, compared with previous FPGA-based APs, we reduce power consumption by 4.0-6.6x and improve energy efficiency by 2.7-5.9x.
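The DFA state explosion mentioned in the abstract can be illustrated with a small, self-contained sketch. It is not the flex-hybrid-FA algorithm or any part of the paper's design; it is the textbook subset construction applied to a standard worst-case pattern, "the n-th symbol from the end is a" over the alphabet {a, b}. The function names `nth_from_end_nfa` and `count_dfa_states` are hypothetical and chosen only for this illustration. The NFA for this pattern needs n + 1 states, while the equivalent DFA needs 2^n reachable states.

```python
from collections import deque


def nth_from_end_nfa(n):
    """Build an NFA for (a|b)*a(a|b)^(n-1) with states 0..n.

    State 0 is the start state and loops on every symbol; on 'a' it may
    also guess that this 'a' is the n-th symbol from the end. State n is
    the only accepting state. (Illustrative helper, not from the paper.)
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for state in range(1, n):
        delta[(state, "a")] = {state + 1}
        delta[(state, "b")] = {state + 1}
    return delta, 0, n


def count_dfa_states(delta, start, alphabet="ab"):
    """Run the subset construction and count the reachable DFA states."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # The DFA successor is the union of all NFA successors.
            successor = frozenset(
                target
                for state in current
                for target in delta.get((state, symbol), ())
            )
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return len(seen)


for n in (2, 4, 8):
    delta, start, accept = nth_from_end_nfa(n)
    print(f"n={n}: NFA states={n + 1}, DFA states={count_dfa_states(delta, start)}")
```

The exponential gap between the two counts is the reason a purely DFA-based von Neumann AP is restricted to small REGEX sets, and why a hybrid model that keeps only part of the automaton in DFA form can trade memory against logic resources.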