Wafer-Scale Computing: Advancements, Challenges, and Future Perspectives [Feature]
Yang Hu, Xinhan Lin, Huizheng Wang, Zhen He, Xingmao Yu, Jiahao Zhang, Qize Yang, Zheng Xu, Sihan Guan, Jiahao Fang, Haoran Shang, Xinru Tang, Xu Dai, Shaojun Wei, Shouyi Yin
DOI: https://doi.org/10.1109/mcas.2024.3349669
IF: 4.04
2024-01-01
IEEE Circuits and Systems Magazine
Abstract: Artificial intelligence (AI) built on large models plays an increasingly important role in both academia and industry, and it drives a rapidly increasing demand for hardware computing power. The growth of hardware computing power has failed to keep up with this demand, which has become a significant factor restricting the development of AI. Hardware computing power grows mainly through increases in transistor density and chip area. However, the former is impeded by the end of Moore's Law and Dennard scaling, and the latter is significantly restricted by the difficulty of disrupting legacy fabrication equipment and processes. In recent years, maturing advanced packaging technologies have increasingly been used to build bigger chips that integrate multiple chiplets while still providing interconnections with chip-level density and bandwidth. This technique opens a new path for continuing to increase computing power while leveraging the current fabrication process without significant disruption. Enabled by this technique, a chip can extend to wafer scale, providing orders of magnitude more computing capability (several POPS within just one monolithic chip) and die-to-die bandwidth density (over 15 GB/s/mm) than regular chips, and giving rise to a new Wafer-scale Computing paradigm. Compared to conventional high-performance computing paradigms such as multi-accelerator and datacenter-scale computing, Wafer-scale Computing shows remarkable advantages in communication bandwidth, integration density, and programmability potential. Not surprisingly, this disruptive paradigm also brings unprecedented design challenges for hardware architecture, design-/system-technology co-optimization, power and cooling systems, and the compiler tool chain.
At present, there are no comprehensive surveys summarizing the current state and design insights of Wafer-scale Computing. This article aims to take the first step to help academia and industry review existing wafer-scale chips and essential technologies in a one-stop manner, so that readers can conveniently grasp the basic knowledge and key points, understand the achievements and shortcomings of existing research, and contribute to this promising research direction.