UpDown: Programmable fine-grained Events for Scalable Performance on Irregular Applications

Andronicus Rajasukumar, Jiya Su, Yuqing Wang, Tianshuo Su, Marziyeh Nourian, Jose M Monsalve Diaz, Tianchi Zhang, Jianru Ding, Wenyi Wang, Ziyi Zhang, Moubarak Jeje, Henry Hoffmann, Yanjing Li, Andrew A. Chien
2024-07-30
Abstract: Applications with irregular data structures, data-dependent control flows, and fine-grained data transfers (e.g., real-world graph computations) perform poorly on cache-based systems. We propose the UpDown accelerator, which supports fine-grained execution with novel architecture mechanisms: lightweight threading, event-driven scheduling, efficient ultra-short threads, and split-transaction DRAM access with software-controlled synchronization. These hardware primitives support software-programmable events, enabling high performance on diverse data structures and algorithms. UpDown also supports scalable performance; hardware replication enables programs to scale up performance. Evaluation results show that UpDown's flexibility and scalability enable it to outperform CPUs on graph mining and analytics computations with geomean speedups of up to 116-195x, and to achieve more than 4x speedup over prior accelerators. We show that UpDown generates the high memory parallelism (~4.6x over CPU) required for memory-intensive graph computations. We present measurements that attribute the performance of UpDown (23x architectural advantage) to its individual architectural mechanisms. Finally, we analyze the area and power cost of UpDown's mechanisms for software programmability.
Hardware Architecture