Abstract: In-memory processing is becoming a popular method to alleviate the memory bottleneck of the von Neumann computing model. With the goal of improving both the latency and the energy cost of such in-memory processing, emerging non-volatile memory technologies, such as Spintronic magnetic memory, are of particular interest, as they can provide near-SRAM read/write performance and eliminate nearly all static energy without experiencing any endurance limitations. Spintronic Racetrack Memory (RM) further addresses the density concerns of spin-transfer torque memory (STT-MRAM). Moreover, it has recently been demonstrated that portions of RM nanowires can function as a polymorphic gate, which can be leveraged to implement multi-operand bulk bitwise operations. With more complex control, they can also be leveraged to build arithmetic integer and floating-point processing-in-memory (PIM) primitives. This paper proposes SPIMulator, a Spintronic PIM simulator that models both the storage architecture of Racetrack memory and the execution of PIM commands within it. SPIMulator functionally models the polymorphic gate properties recently proposed for Racetrack memory, which rely on a transverse access that determines the number of ‘1’s in a segment of each Racetrack nanowire. From this simulation, SPIMulator can report real-time performance statistics such as cycle count and energy. Thus, SPIMulator simulates the recently proposed multi-operand bitwise logic operations and can be easily extended to implement new PIM operations as they are developed. Because SPIMulator is a functional simulator, it can also serve as a programming environment for developing PIM-based code and verifying new acceleration algorithms.
We demonstrate the value of SPIMulator by modeling and estimating the performance and energy consumption of a variety of example applications, including the Advanced Encryption Standard (AES), an encryption algorithm based primarily on logical and look-up operations; matrix multiplication, a frequent requirement in scientific, signal processing, and machine learning algorithms; and bitmap indices, a common search table employed for database lookups.
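
To make the transverse-access idea concrete, the following is a minimal Python sketch, not SPIMulator's actual code or API. It assumes each nanowire holds one bit position and that the operands occupy consecutive domains along the wire, so a transverse access over that segment returns how many operands have a '1' at that position. The function names `transverse_read` and `bulk_bitwise`, and the mapping from counts to AND/OR/XOR, are illustrative assumptions about how a count of '1's can act as a polymorphic gate.

```python
from typing import List, Sequence


def transverse_read(nanowires: Sequence[Sequence[int]], start: int, length: int) -> List[int]:
    """Hypothetical model of a transverse access: for every nanowire,
    count the '1's stored in domains [start, start + length)."""
    return [sum(wire[start:start + length]) for wire in nanowires]


def bulk_bitwise(counts: Sequence[int], n_operands: int, op: str) -> List[int]:
    """Derive a multi-operand bitwise result from per-nanowire '1' counts.
    The thresholds below are an assumed interpretation of the polymorphic gate."""
    if op == "and":   # every operand has a '1' at this bit position
        return [int(c == n_operands) for c in counts]
    if op == "or":    # at least one operand has a '1'
        return [int(c >= 1) for c in counts]
    if op == "xor":   # an odd number of operands have a '1'
        return [c % 2 for c in counts]
    raise ValueError(f"unsupported operation: {op}")


# Three 4-bit operands, most significant bit first.
a = [1, 1, 0, 0]
b = [1, 0, 1, 0]
c = [1, 0, 0, 1]

# Transpose so that nanowire j stores bit j of every operand.
nanowires = list(zip(a, b, c))
counts = transverse_read(nanowires, start=0, length=3)

and_result = bulk_bitwise(counts, 3, "and")
or_result = bulk_bitwise(counts, 3, "or")
xor_result = bulk_bitwise(counts, 3, "xor")
```

In this sketch a single count per nanowire is enough to produce all three results, which is the sense in which one access can serve several logic functions. Cycle and energy accounting are deliberately left out, since their values depend on device parameters not given here.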

SPIMulator: A Spintronic Processing-In-Memory Simulator for Racetracks
