StreamPIM: Streaming Matrix Computation in Racetrack Memory

Yuda An, Yunxiao Tang, Shushu Yi, Li Peng, Xiurui Pan, Guangyu Sun, Zhaochu Luo, Qiao Li, Jie Zhang
DOI: https://doi.org/10.1109/hpca57654.2024.00031
2024-01-01
Abstract: Racetrack memory (RM) techniques have become promising solutions to the memory wall problem because they increase memory density, reduce energy consumption, and can be used to build processing-in-memory (PIM) architectures. RM can place arithmetic logic units in or near its memory arrays to process tasks offloaded by the host. Although multiple studies of processing in RM already exist, these solutions suffer from data transfer overheads imposed by the loose coupling of the memory core and the computation units. To address this issue, we propose StreamPIM, a new processing-in-RM architecture that tightly couples the memory core and the computation units. Specifically, StreamPIM constructs a matrix processor directly from domain-wall nanowires, without using CMOS-based computation units. It also introduces a domain-wall nanowire-based bus, which can eliminate electromagnetic conversion. StreamPIM further improves performance by leveraging RM internal parallelism. Our evaluation results show that StreamPIM achieves 39.1x higher performance and 58.4x lower energy consumption than the traditional computing platform.