How to Accelerate FPGA Application in an Asynchronous Way?
Anping He, Jinlin Zhang, Lvying Yu, Pengfei Li, Lian Li
DOI: https://doi.org/10.1145/3289602.3293953
2019-01-01
Abstract: FPGAs, with their massive customizable parallel computation capacity, are potentially well suited to applications that need a fast time-to-market. However, their complex placement and routing lead to relatively large latency and low clock frequency. In addition, clocking problems can make a design difficult and slow, especially one with complex control or variable computations. All of these defects appear to stem from the synchronous design methodology itself, and there is no easy way to solve them with clocks. Although FPGA vendors do not supply an asynchronous design routine, flow, or tool, this paper shows that it is still possible to implement a clockless design on a concrete FPGA chip and so gain many benefits that a synchronous design misses. We adopt link-joint as the asynchronous communication mechanism; it removes the clock limitation while achieving high throughput through fast handshakes between neighboring clicks. The simplest link-joint circuit is the click, which conforms to the Bundled Bound Data (BBD) protocol for local communication. Multiple clicks can be composed and trimmed into various types of micro-pipeline structures, feasibly and flexibly. Based on these considerations, we propose an innovative asynchronous design method for Xilinx FPGA applications, together with an asynchronous control framework built from dedicated micro-pipeline structures. Furthermore, we introduce delay matching techniques as well as the whole design flow and tool-chain. Together, these provide a practical way of accelerating an asynchronous design on an FPGA. The case studies show that communication between neighboring clicks takes less than 1.1 ns and that the asynchronous method substantially reduces FPGA latency.