Arvon: A Heterogeneous System-in-Package Integrating FPGA and DSP Chiplets for Versatile Workload Acceleration
Junkang Zhu, Wei Tang, Cheng-Hsun Lu, Zhengya Zhang, T. Hoang, Sung-gun Cho, Wei Qiang Zhu, M. Flanigan, Ching-Chi Chang, S. Kale, Thungoc Tran, Ramya Yarlagadda, Tianyu Wei, Yaoyu Tao, Naomi Kavi Motwani, Sergey Y. Shumarayev, Allen Chan, Jacob Botimer, Mani Yalamanchi
DOI: https://doi.org/10.1109/JSSC.2023.3343457
IF: 5.4
2024-04-01
IEEE Journal of Solid-State Circuits
Abstract: Integrating heterogeneous chiplets in a package presents a promising and cost-effective approach to constructing scalable and flexible systems for accelerating a wide range of workloads. We introduce Arvon, which integrates a 14-nm FPGA chiplet with two efficient and densely packed 22-nm DSP chiplets using embedded multidie interconnect bridges (EMIBs). The chiplets are interconnected via a 1.536-Tb/s advanced interface bus (AIB) 1.0 interface and a 7.68-Tb/s AIB 2.0 interface. Arvon is programmable and supports workloads ranging from neural networks (NNs) to communication signal processing. Each DSP chiplet delivers a peak performance of 4.14 TFLOPS in half-precision floating-point while maintaining a power efficiency of 1.8 TFLOPS/W. A compilation procedure is developed to map workloads across the FPGA and DSPs to optimize performance and utilization. Our AIB 2.0 interface implementation using 36-µm-pitch microbumps achieves a data transfer rate of 4 Gb/s/pin, with an energy efficiency of 0.10–0.46 pJ/b including the adapter. The bandwidth density reaches 1.024 Tb/s/mm of shoreline and 1.705 Tb/s/mm² of area.
Engineering, Computer Science
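
The figures quoted in the abstract can be cross-checked with some simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes that (a) the 7.68-Tb/s AIB 2.0 bandwidth is the sum of data pins each running at 4 Gb/s, (b) the 0.10–0.46 pJ/b energy figure applies at that full aggregate bandwidth, and (c) the 1.8-TFLOPS/W efficiency holds at the 4.14-TFLOPS peak operating point. None of these assumptions is confirmed by the abstract.

```python
# Back-of-envelope checks using only numbers quoted in the abstract.
# All derived quantities rest on the assumptions stated in the lead-in.

def implied_data_pins(bandwidth_tbps: float, rate_gbps_per_pin: float) -> float:
    """Number of data pins needed to reach an aggregate bandwidth (assumption a)."""
    return bandwidth_tbps * 1000.0 / rate_gbps_per_pin


def io_power_watts(energy_pj_per_bit: float, bandwidth_tbps: float) -> float:
    """Interface power at full bandwidth: (pJ/b) * (Tb/s) = W (assumption b)."""
    return energy_pj_per_bit * bandwidth_tbps


def implied_dsp_power_watts(peak_tflops: float, tflops_per_watt: float) -> float:
    """DSP chiplet power if efficiency holds at peak throughput (assumption c)."""
    return peak_tflops / tflops_per_watt


if __name__ == "__main__":
    print("AIB 2.0 data pins:", round(implied_data_pins(7.68, 4.0)))
    print("AIB 2.0 I/O power (W):",
          round(io_power_watts(0.10, 7.68), 3), "to",
          round(io_power_watts(0.46, 7.68), 3))
    print("DSP chiplet power at peak (W):",
          round(implied_dsp_power_watts(4.14, 1.8), 2))
```

Under these assumptions the AIB 2.0 link would correspond to roughly 1920 data pins, an interface power between about 0.77 W and 3.53 W at full bandwidth, and about 2.3 W per DSP chiplet at peak throughput. These are derived estimates, not values reported by the authors.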