Hardware Efficient and Low-Latency CA-SCL Decoder Based on Distributed Sorting
Xiao Liang, Junmei Yang, Chuan Zhang, Wenqing Song, Xiaohu You
DOI: https://doi.org/10.1109/glocom.2016.7841865
2016-01-01
Abstract: For polar codes, the cyclic redundancy check (CRC)-aided successive cancellation list (CA-SCL) decoder has attracted increasing attention from both academia and industry. In this paper, a hardware-efficient and low-latency CA-SCL polar decoder based on distributed sorting is proposed for the first time. For path metric (PM) sorting at each level, a distributed sorting (DS) algorithm is proposed to reduce the comparison complexity from O(L^2) to O(L) (L denotes the list size), and the latency from kL^2 to kL (k is a coefficient independent of L). By employing the folding technique, the N-bit folding polar decoder can be implemented based on the basic √N-bit polar decoder. In addition, pipelining is employed to resolve the timing issues resulting from folding. The CRC is performed serially over the 2L candidate paths to reduce hardware cost. In a demonstration of the (1024, 512) code on an Altera Stratix V FPGA, the proposed CA-SCL decoders with L = 2 and adjustable L = 2, 4 consume 9% and 50% of the board resources, respectively. The decoding latencies are 2,528 and 4,064 clock cycles, respectively. For L = 2 and 4, a frame error rate (FER) of 10^-2 is achieved at signal-to-noise ratios (SNRs) of 2.36 dB and 2.06 dB, respectively. Compared with the floating-point results, the performance degradation is negligible. Thus, the proposed design is suitable for, and adjustable to, different real-life scenarios.
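As a rough illustration of the serial CRC step mentioned in the abstract, the following Python sketch checks candidate paths one at a time in order of path metric and returns the first one whose CRC matches. It is a software model under stated assumptions, not the paper's hardware design: the generator polynomial (x^3 + x + 1), the convention that a smaller PM is better, the fallback to the best-PM path when no candidate passes, and the names crc_remainder and select_path are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def crc_remainder(bits: Sequence[int], poly: Sequence[int]) -> List[int]:
    """Remainder of bits(x) * x^r divided by poly(x) over GF(2), r = deg(poly)."""
    r = len(poly) - 1
    reg = list(bits) + [0] * r  # append r zeros, then do long division
    for i in range(len(bits)):
        if reg[i]:
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[-r:]


def passes_crc(path_bits: Sequence[int], poly: Sequence[int]) -> bool:
    """True if the trailing r bits equal the CRC of the leading message bits."""
    r = len(poly) - 1
    message, check = list(path_bits[:-r]), list(path_bits[-r:])
    return crc_remainder(message, poly) == check


def select_path(
    candidates: Sequence[Tuple[float, Sequence[int]]],
    poly: Sequence[int],
) -> Sequence[int]:
    """Serially CRC-check candidates in PM order (assumed: smaller PM is better).

    Returns the first candidate that passes; if none passes, falls back to
    the candidate with the best PM (an assumption made for this sketch).
    """
    ordered = sorted(candidates, key=lambda c: c[0])
    for _, bits in ordered:  # one CRC check at a time, mimicking serial reuse
        if passes_crc(bits, poly):
            return bits
    return ordered[0][1]


# Hypothetical generator polynomial x^3 + x + 1, coefficients MSB first.
POLY = [1, 0, 1, 1]
good = [1, 1, 0, 1] + crc_remainder([1, 1, 0, 1], POLY)  # valid message + CRC
bad = [1, 1, 0, 0] + crc_remainder([1, 1, 0, 1], POLY)   # one message bit flipped
chosen = select_path([(0.5, bad), (1.2, good)], POLY)
```

Here the candidate with the better metric fails the CRC, so the second candidate is selected. Checking the candidates serially means a single CRC unit can be reused across all of them, at the expense of extra cycles.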