Enhancing Scalability and Performance in Influence Maximization with Optimized Parallel Processing

Hanjiang Wu, Huan Xu, Joongun Park, Jesmin Jahan Tithi, Fabio Checconi, Jordi Wolfson-Pou, Fabrizio Petrini, Tushar Krishna
2024-11-14
Abstract: Influence Maximization (IM) is vital in viral marketing and biological network analysis for identifying key influencers. Because the problem is NP-hard, approximate solutions are used in practice. This paper addresses scalability challenges on scale-out shared-memory systems, focusing on the state-of-the-art Influence Maximization via Martingales (IMM) benchmark. To improve the work efficiency of the current IMM implementation, we propose EFFICIENTIMM, with key strategies including a new parallelization scheme, NUMA-aware memory usage, dynamic load balancing, and fine-grained adaptive data structures. Benchmarked on a 128-core CPU system with 8 NUMA nodes, EFFICIENTIMM achieves an average 5.9x speedup across 8 diverse SNAP datasets relative to the best execution times of the original Ripples framework. Additionally, on the Youtube graph, EFFICIENTIMM exhibits a better memory access pattern, with a 357.4x reduction in L1+L2 cache misses compared to Ripples.
Distributed, Parallel, and Cluster Computing; Data Structures and Algorithms