RingTK: A Ring, Parallel and High Performance Top-K Sorter on FPGA

Huawen Liang, Qizhe Wu, Wei Yuan, Teng Tian, Xi Jin
DOI: https://doi.org/10.1109/fccm60383.2024.00043
2024-01-01
Abstract: Getting the K largest/smallest elements from $N$ inputs is one of the essential operations in many applications. In this article, we propose RingTK, a ring, parallel, and high performance Top-K sorter implemented on FPGA. We use a priority queue (PQ) as the basic processing unit and design a Top-K sorter based on a ring topology with a global maximum module (GMM) to obtain good scalability and high performance. Based on the ring topology, we design Ring Multiplexers (RMUX) and modify the GMM so that RingTK can efficiently handle different values of K and concurrent tasks. We design an encoder to flexibly handle different data formats and both max and min Top-K tasks. Finally, we implement the proposed architecture on the Xilinx XCVU37P FPGA. The results show that the proposed architecture has good scalability, and that throughput scales approximately linearly with parallelism. With 32 PQs and K=8160, we achieve a throughput of 35.38 GB/s, which exceeds that of existing work.
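As a point of reference for the Top-K operation itself, the following is a minimal software sketch, not the RingTK hardware design. It assumes a single bounded priority queue built on Python's `heapq`, and it models the max/min handling with a simple sign flip; the function name `top_k` and the `largest` parameter are hypothetical and do not come from the paper.

```python
import heapq


def top_k(values, k, largest=True):
    """Return the k largest (or smallest) values, best first.

    Software analogy only: one bounded priority queue, no ring
    topology, no GMM, no parallelism.
    """
    if k <= 0:
        return []
    # Assumed stand-in for an "encoder": negating keys turns a
    # min Top-K task into a max Top-K task on the same queue.
    sign = 1 if largest else -1
    heap = []  # min-heap holding the k best encoded keys seen so far
    for v in values:
        key = sign * v
        if len(heap) < k:
            heapq.heappush(heap, key)
        elif key > heap[0]:
            # New key beats the weakest kept key: replace it.
            heapq.heapreplace(heap, key)
    # Decode keys back to the original values, best first.
    return [sign * key for key in sorted(heap, reverse=True)]
```

For example, `top_k([5, 1, 9, 3, 7], 2)` keeps the two largest inputs, and passing `largest=False` reuses the same queue logic for the two smallest. Each input costs O(log K) in this sketch, so the whole pass is O(N log K).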