Abstract: In-memory processing is becoming a popular method to alleviate the memory bottleneck of the von Neumann computing model. To improve both the latency and the energy cost of such in-memory processing, emerging non-volatile memory technologies, such as spintronic magnetic memory, are of particular interest: they can provide near-SRAM read/write performance and eliminate nearly all static energy without experiencing any endurance limitations. Spintronic Racetrack Memory (RM) further addresses the density concerns of spin-transfer torque memory (STT-MRAM). Moreover, it has recently been demonstrated that portions of RM nanowires can function as a polymorphic gate, which can be leveraged to implement multi-operand bulk bitwise operations. With more complex control, they can also be leveraged to build integer and floating-point arithmetic processing-in-memory (PIM) primitives. This paper proposes SPIMulator, a Spintronic PIM simulator that models the storage and PIM architecture used to execute PIM commands in Racetrack memory. SPIMulator functionally models the polymorphic gate properties recently proposed for Racetrack memory, which rely on a transverse access that determines the number of '1's in a segment of each Racetrack nanowire. From this simulation, SPIMulator can report real-time performance statistics such as cycle count and energy. SPIMulator thus simulates the recently proposed multi-operand bitwise logic operations and can be easily extended to implement new PIM operations as they are developed. Because SPIMulator is a functional simulator, it can also serve as a programming environment for developing PIM-based code and verifying new acceleration algorithms.
We demonstrate the value of SPIMulator by modeling and estimating the performance and energy consumption of a variety of example applications: the Advanced Encryption Standard (AES), an encryption algorithm based primarily on logical and look-up operations; matrix multiplication, a frequent requirement in scientific, signal processing, and machine learning algorithms; and bitmap indices, a common search table employed for database lookups.
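
The transverse access described in the abstract can be illustrated with a small functional sketch. It is not SPIMulator code: the function names `transverse_read` and `bulk_bitwise` are hypothetical, and the mapping from the '1' count to AND, OR, and XOR is an assumption about how a count-based polymorphic gate could be decoded. Each bit position is modeled as one nanowire segment holding one bit from every operand.

```python
from typing import List


def transverse_read(segment: List[int]) -> int:
    """Return the number of '1's stored in one nanowire segment.

    Hypothetical stand-in for the transverse access in the abstract.
    """
    return sum(segment)


def bulk_bitwise(operands: List[List[int]], op: str) -> List[int]:
    """Apply a multi-operand bitwise operation across equal-width bit vectors.

    Assumption: the k bits at one position across all operands sit in the
    same nanowire segment, so a single count per position decides the result.
    """
    k = len(operands)
    result = []
    for column in zip(*operands):
        ones = transverse_read(list(column))
        if op == "and":
            bit = int(ones == k)   # every operand bit is 1
        elif op == "or":
            bit = int(ones >= 1)   # at least one operand bit is 1
        elif op == "xor":
            bit = ones & 1         # odd number of 1s
        else:
            raise ValueError(f"unsupported operation: {op}")
        result.append(bit)
    return result


if __name__ == "__main__":
    rows = [[1, 0, 1, 1], [1, 1, 0, 1], [1, 0, 0, 1]]
    for name in ("and", "or", "xor"):
        print(name, bulk_bitwise(rows, name))
```

In this sketch the same count serves all three operations; only the decoding of the count changes, which is the sense in which the gate is polymorphic. Cycle and energy accounting are left out because the abstract gives no per-operation costs.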

On Consistency for Bulk-Bitwise Processing-in-Memory

Modeling and Benchmarking Computing-in-Memory for Design Space Exploration

LazyPIM: Efficient Support for Cache Coherence in Processing-in-Memory Architectures

Enabling Relational Database Analytical Processing in Bulk-Bitwise Processing-In-Memory

On Error Correction for Nonvolatile Processing-In-Memory

Shared-PIM: Enabling Concurrent Computation and Data Flow for Faster Processing-in-DRAM

abstractPIM: A Technology Backward-Compatible Compilation Flow for Processing-In-Memory

Inclusive-PIM: Hardware-Software Co-design for Broad Acceleration on Commercial PIM Architectures

FAT-PIM: Low-Cost Error Detection for Processing-In-Memory

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-in-Memory Hardware

Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture

UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space

Methodologies, Workloads, and Tools for Processing-in-Memory: Enabling the Adoption of Data-Centric Architectures

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory

SPIMulator: A Spintronic Processing-In-Memory Simulator for Racetracks

The Bitlet Model: Defining a Litmus Test for the Bitwise Processing-in-Memory Paradigm

How to be consistent with persistent memory? An evaluation approach

A$^3$PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader

AritPIM: High-Throughput In-Memory Arithmetic

PIMSAB: A Processing-In-Memory System with Spatially-Aware Communication and Bit-Serial-Aware Computation