OPIMA: Optical Processing-In-Memory for Convolutional Neural Network Acceleration

Febin Sunny, Amin Shafiee, Abhishek Balasubramaniam, Mahdi Nikdast, Sudeep Pasricha
2024-07-11
Abstract: Recent advances in machine learning (ML) have spotlighted the pressing need for computing architectures that bridge the gap between memory bandwidth and processing power. The advent of deep neural networks has pushed traditional Von Neumann architectures to their limits due to the high latency and energy consumption costs associated with data movement between the processor and memory for these workloads. One of the solutions to overcome this bottleneck is to perform computation within the main memory through processing-in-memory (PIM), thereby limiting data movement and the costs associated with it. However, DRAM-based PIM struggles to achieve high throughput and energy efficiency due to internal data movement bottlenecks and the need for frequent refresh operations. In this work, we introduce OPIMA, a PIM-based ML accelerator, architected within an optical main memory. OPIMA has been designed to leverage the inherent massive parallelism within main memory while performing high-speed, low-energy optical computation to accelerate ML models based on convolutional neural networks. We present a comprehensive analysis of OPIMA to guide design choices and operational mechanisms. Additionally, we evaluate the performance and energy consumption of OPIMA, comparing it with conventional electronic computing systems and emerging photonic PIM architectures. The experimental results show that OPIMA can achieve 2.98x higher throughput and 137x better energy efficiency than the best-known prior work.
Hardware Architecture, Emerging Technologies, Machine Learning
What problem does this paper attempt to address?
The paper addresses the von Neumann bottleneck that existing computing architectures encounter when running deep neural network (DNN) workloads. Frequent data movement between the processor and memory incurs high latency and energy consumption, so the traditional von Neumann architecture cannot keep up with the growing computational demands of modern machine-learning models. To overcome this bottleneck, the paper proposes OPIMA (Optical Processing-In-Memory Accelerator), a processing-in-memory (PIM) architecture built within an optical main memory. OPIMA accelerates convolutional neural networks (CNNs) by performing optical computation inside main memory, which reduces data movement and improves energy efficiency. The specific issues and goals the paper addresses are:

1. **Limitations of traditional PIM architectures**:
   - DRAM-based PIM struggles to achieve high throughput and energy efficiency because of internal data movement bottlenecks and frequent refresh operations.
   - Non-volatile memories (such as ReRAM and STT-RAM) face manufacturing challenges and endurance problems.
   - Phase-change memory (PCM) offers high energy efficiency and bit density, but under electrical control it suffers from non-linear response and resistance drift.
2. **Advantages of photonic computing**:
   - Photonic computing exploits the parallelism and low energy consumption of light and is well suited to large-scale matrix operations.
   - Optimizing the optical properties of PCM materials can improve the accuracy and speed of data reads without increasing power consumption.
3. **Design goals of the OPIMA architecture**:
   - Provide efficient storage cells with multi-bit density to support complex ML computations.
   - Achieve high-speed, low-energy optical computation, thereby significantly improving the throughput and energy efficiency of ML inference.
   - Solve the data interference and thermal crosstalk problems found in traditional PIM architectures to ensure reliable computation.

The paper shows that OPIMA has significant performance and energy advantages over existing electronic computing systems and other emerging photonic PIM architectures. Experimental results show that OPIMA achieves 2.98 times higher throughput and 137 times better energy efficiency than the best-known prior work.

### Formula summary

The key formulas involved in the paper include:

1. **Optical transmission change model**:
   \[ T_{\text{out}} = T_{\text{in}} - \Delta T_s - P_{\text{abs}} \]
   where:
   - \( T_{\text{out}} \) is the output transmission,
   - \( T_{\text{in}} \) is the input power,
   - \( \Delta T_s \) is the optical transmission change due to light scattering and back-reflection,
   - \( P_{\text{abs}} \) is the total power absorbed by the PCM unit.
2. **Objective of optimized design**:
   \[ \Delta T_s = 0 \;\Rightarrow\; T_{\text{out}} = T_{\text{in}} - P_{\text{abs}} \]
   The design drives \( \Delta T_s \) to zero so that the change in the signal is fully determined by the written data (\( P_{\text{abs}} \)).
3. **Optical transmission contrast**:
   \[ \Delta T = T_{\text{amorphous}} - T_{\text{crystalline}} \]
   where:
   - \( T_{\text{amorphous}} \) is the optical transmission in the amorphous state,
   - \( T_{\text{crystalline}} \) is the optical transmission in the crystalline state.

Through these improvements, OPIMA addresses the bottlenecks that traditional computing architectures face in deep-learning tasks and provides a more efficient, lower-energy solution.
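The three formulas in the summary can be checked numerically with a small sketch. The code below is illustrative only and is not taken from the paper: the function names are hypothetical, all quantities are treated as fractions of the input power, and the numeric values are made-up placeholders rather than measured PCM data. It shows why the design objective matters: a reader that assumes ΔT_s = 0 recovers P_abs exactly in the ideal case, and overestimates it by ΔT_s otherwise.

```python
def output_transmission(t_in, delta_t_s, p_abs):
    """Transmission model from the summary: T_out = T_in - dT_s - P_abs."""
    return t_in - delta_t_s - p_abs


def inferred_absorption(t_in, t_out):
    """Absorption a reader would infer if it assumes dT_s = 0 (hypothetical helper)."""
    return t_in - t_out


def transmission_contrast(t_amorphous, t_crystalline):
    """Contrast between the two PCM phases: dT = T_amorphous - T_crystalline."""
    return t_amorphous - t_crystalline


if __name__ == "__main__":
    t_in, p_abs = 1.0, 0.25  # placeholder values, normalized to input power

    # Ideal cell: no scattering or back-reflection, so the stored value is recovered.
    ideal = output_transmission(t_in, 0.0, p_abs)
    print("ideal read:", inferred_absorption(t_in, ideal))

    # Non-ideal cell: scattering loss is misread as extra absorption.
    lossy = output_transmission(t_in, 0.125, p_abs)
    print("lossy read:", inferred_absorption(t_in, lossy))

    # Placeholder phase transmissions, not values reported in the paper.
    print("contrast:", transmission_contrast(0.875, 0.125))
```

A larger transmission contrast leaves more room between distinguishable transmission levels, which is relevant to the multi-bit storage goal listed in the summary. How OPIMA actually maps stored values to levels is not described here, so the sketch does not model it.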