Abstract: Read mapping, which maps billions of reads to a reference DNA, poses a significant performance bottleneck in genomic analysis. Current accelerators for read mapping are primarily bounded by intensive, random memory access to huge datasets. Near-data processing (NDP) infrastructures promise extremely high bandwidth. However, existing frameworks fail to reach this potential due to poor locality and high redundancy. Our idea is to introduce prediction, based on the insight that candidate mapping positions become predictable when the reference is organized in coarse-grained slices. We present GEM (Genomic Memory), an ultra-efficient near-memory accelerator for read mapping. GEM adopts a novel data-centric framework, named dividing-and-predictive-scattering (DPS), which synthesizes information about seed existence to predict the target mapping locations and thereby reduce memory access redundancy. During preparation, DPS divides the reference into coarse-grained slices and creates predictive filters to assess the likelihood of reads belonging to each slice. During mapping, DPS predicts and scatters reads to considerably fewer slices than it would without prediction. By employing small on-chip SRAM-based predictors with high accuracy, DPS minimizes unnecessary DRAM access and data movement from remote memory. In essence, DPS trades pre-seeding predictors for localized access patterns and low redundancy, hence achieving high throughput for data-intensive applications. We implement GEM by integrating coarse-grain reconfigurable architectures (CGRAs) in the logic layer of a 3D-stacked DRAM infrastructure, utilizing the massive banks as slices. GEM leverages CGRAs for their flexibility in supporting various algorithms tailored to different datasets. Bloom filters are leveraged for slice prediction, providing an error rate below 1%.
Evaluation results demonstrate that GEM reduces memory requests by 95% and alignments by 87%, achieving throughput improvements of 15.3× and 11.0× over compute-centric and broadcast-based baselines, respectively, on the same NDP platform. Overall, GEM achieves a 3.5× throughput improvement and 2.1× better energy efficiency compared to state-of-the-art ASIC accelerators.
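
The divide-then-predict idea behind DPS can be illustrated with a minimal software sketch. This is not GEM's implementation: the class and function names (`BloomFilter`, `build_slice_filters`, `predict_slices`), the filter size, the hash choice, the slice overlap, and the `min_fraction` threshold are all assumptions made for illustration, and a software Bloom filter stands in for the on-chip SRAM predictors. The sketch builds one Bloom filter of k-mer seeds per reference slice, then forwards a read only to slices whose filter reports enough of the read's seeds.

```python
import hashlib
import random


class BloomFilter:
    """Small Bloom filter over strings (illustrative sizes, not GEM's)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([i])
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def seeds(seq, k):
    """All overlapping k-mers of seq (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_slice_filters(reference, slice_len, k, overlap):
    """Preparation: divide the reference into slices, one filter per slice.

    Each slice is extended by `overlap` bases (an assumption of this sketch)
    so that a read crossing a slice boundary is still fully covered by the
    slice it starts in.
    """
    filters = []
    for start in range(0, len(reference), slice_len):
        bf = BloomFilter()
        for s in seeds(reference[start:start + slice_len + overlap], k):
            bf.add(s)
        filters.append(bf)
    return filters


def predict_slices(read, filters, k, min_fraction=1.0):
    """Mapping: return indices of slices likely to contain the read.

    A slice is kept when at least min_fraction of the read's seeds test
    positive in its filter; the threshold is a hypothetical knob.
    """
    read_seeds = seeds(read, k)
    hits = []
    for idx, bf in enumerate(filters):
        present = sum(s in bf for s in read_seeds)
        if present >= min_fraction * len(read_seeds):
            hits.append(idx)
    return hits


# Usage: a synthetic reference, four slices, one error-free read from slice 2.
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(4000))
slice_filters = build_slice_filters(reference, slice_len=1000, k=12, overlap=49)
candidates = predict_slices(reference[2500:2550], slice_filters, k=12)
```

Because a Bloom filter has no false negatives, the slice that actually contains an error-free read always appears in `candidates`; false positives can only add extra slices. Lowering `min_fraction` below 1.0 would be one way to tolerate reads whose seeds are disrupted by sequencing errors, at the cost of scattering to more slices.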

GRIM-filter: fast seed filtering in read mapping using emerging memory technologies

GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping Using Processing-in-Memory Technologies

Accelerating Seed Location Filtering in DNA Read Mapping Using a Commercial Compute-in-SRAM Architecture

An In-Memory Architecture for High-Performance Long-Read Pre-Alignment Filtering

Seed-and-Vote based In-Memory Accelerator for DNA Read Mapping

An Efficient Filtration Method Based on Variable-Length Seeds for Sequence Alignment

Accelerating DNA Read Mapping with Digital Processing-in-Memory

GEM: Ultra-Efficient Near-Memory Reconfigurable Acceleration for Read Mapping by Dividing and Predictive Scattering

GateSeeder: Near-memory CPU-FPGA Acceleration of Short and Long Read Mapping

GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping

Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping

GateKeeper-GPU: Fast and Accurate Pre-Alignment Filtering in Short Read Mapping

The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote

SeGraM: A Universal Hardware Accelerator for Genomic Sequence-to-Graph and Sequence-to-Sequence Mapping

ReadsMap: a new tool for high precision mapping of DNAseq and RNAseq read sequences

AMAS: Optimizing the Partition and Filtration of Adaptive Seeds to Speed Up Read Mapping

Multi-context seeds enable fast and high-accuracy read mapping

PUNAS: A Parallel Ungapped-Alignment-Featured Seed Verification Algorithm for Next-Generation Sequencing Read Alignment

Mapping short reads to a genome without using hash look-up table algorithm and Burrows Wheeler Transformation

Perm: Efficient Mapping of Short Sequencing Reads with Periodic Full Sensitive Spaced Seeds

Bitmapper: an Efficient All-Mapper Based on Bit-Vector Computing