Efficient Enumeration of Large Maximal k-Plexes

Qihao Cheng, Da Yan, Tianhao Wu, Lyuheng Yuan, Ji Cheng, Zhongyi Huang, Yang Zhou
2024-06-10
Abstract: Finding cohesive subgraphs in a large graph has many important applications, such as community detection and biological network analysis. A clique is often too strict a cohesive structure, since communities or biological modules rarely form cliques, for reasons such as data noise. The $k$-plex was therefore introduced as a popular clique relaxation: a graph in which every vertex is adjacent to all but at most $k$ vertices. In this paper, we propose a fast branch-and-bound algorithm, as well as its task-based parallel version, to enumerate all maximal $k$-plexes with at least $q$ vertices. Our algorithm adopts an effective search space partitioning approach that provides a lower time complexity, a new pivot vertex selection method that reduces the number of candidate vertices, an effective upper-bounding technique to prune useless branches, and three novel pruning techniques based on vertex pairs. Our parallel algorithm uses a timeout mechanism to eliminate straggler tasks, and maximizes cache locality while ensuring load balancing. Extensive experiments show that, compared with the state-of-the-art algorithms, our sequential and parallel algorithms enumerate large maximal $k$-plexes with up to $5 \times$ and $18.9 \times$ speedup, respectively. Ablation results also demonstrate that our pruning techniques bring up to $7 \times$ speedup compared with our basic algorithm.
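The definition of a $k$-plex in the abstract can be made concrete with a small membership check. The sketch below is illustrative only and is not taken from the paper: the function name `is_k_plex` and the adjacency-dictionary representation are assumptions, and it assumes the common convention that a vertex counts itself among its at most $k$ non-neighbours, so that a 1-plex is exactly a clique.

```python
def is_k_plex(adj, vertices, k):
    """Return True if `vertices` induces a k-plex in the undirected graph `adj`.

    adj: dict mapping each vertex to the set of its neighbours.
    Assumed convention: a vertex counts itself among its non-neighbours,
    so every member needs at least len(vertices) - k neighbours in the set.
    """
    members = set(vertices)
    need = len(members) - k
    return all(len(adj[v] & members) >= need for v in members)


# Toy graph: the 4-cycle a-b-c-d-a.
cycle = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
}

# The whole cycle is not a clique (1-plex), because a and c are not adjacent,
# but it is a 2-plex: each vertex misses only itself and one other vertex.
print(is_k_plex(cycle, ["a", "b", "c", "d"], 1))  # False
print(is_k_plex(cycle, ["a", "b", "c", "d"], 2))  # True
```

Under this convention the property is hereditary: every subset of a $k$-plex is also a $k$-plex, which is what makes set-enumeration search applicable.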
Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper addresses the problem of efficiently enumerating all maximal \( k \)-plexes with at least \( q \) vertices in large graphs. It proposes a fast branch-and-bound algorithm and a task-based parallel version of it. The main contributions of the paper include:

1. **Search Space Partitioning Method**: An effective search space partitioning method is proposed, which the abstract credits with a lower time complexity. Each set-enumeration subtree is treated as an independent task, which enables parallel processing.
2. **New Pivot Selection Method**: A new pivot selection method is introduced that maximizes the number of saturated vertices in the current \( k \)-plex, reducing the number of candidate vertices.
3. **Upper Bound Technique**: An efficient upper-bounding technique is designed to prune useless branches.
4. **Vertex Pair-Based Pruning Techniques**: Three new vertex pair-based pruning techniques are proposed to further reduce the search space.
5. **Task Parallel Computing Method**: A task-based parallel computing method is proposed that uses a timeout mechanism to eliminate straggler tasks and maximizes cache locality while ensuring load balancing.

Experimental results show that, compared with the state-of-the-art algorithms, the proposed sequential and parallel algorithms achieve up to 5 times and 18.9 times speedup, respectively. Ablation results show that the pruning techniques bring up to 7 times speedup over the paper's basic algorithm.
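To show what the enumeration problem asks for, here is a deliberately simple set-enumeration baseline. It is not the paper's algorithm: it has no pivot selection, no vertex-pair pruning and no parallelism, and its only bound is the trivial size check against \( q \). The names `is_k_plex` and `maximal_k_plexes` and the adjacency-dictionary input are assumptions made for this sketch, which also assumes the convention that a vertex counts itself among its at most \( k \) non-neighbours.

```python
def is_k_plex(adj, vertices, k):
    """True if every vertex in `vertices` has >= len(vertices) - k neighbours in it."""
    members = set(vertices)
    need = len(members) - k
    return all(len(adj[v] & members) >= need for v in members)


def maximal_k_plexes(adj, k, q):
    """Enumerate all maximal k-plexes with at least q vertices (naive baseline)."""
    results = []

    def branch(current, candidates, excluded):
        # Keep only vertices that can still extend the current k-plex.
        # This is safe because every subset of a k-plex is itself a k-plex.
        candidates = [v for v in candidates if is_k_plex(adj, current + [v], k)]
        excluded = [v for v in excluded if is_k_plex(adj, current + [v], k)]
        if not candidates:
            # Maximal only if no previously excluded vertex could be added.
            if not excluded and len(current) >= q:
                results.append(frozenset(current))
            return
        # Trivial size bound: this subtree cannot reach q vertices.
        if len(current) + len(candidates) < q:
            return
        for i, v in enumerate(candidates):
            # Each sub-branch is independent of its siblings, which is what
            # makes subtrees natural units of work for task-based parallelism.
            branch(current + [v], candidates[i + 1:], excluded + candidates[:i])

    branch([], sorted(adj), [])
    return results


# Toy graph: the 4-cycle a-b-c-d-a.
cycle = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
}

print(maximal_k_plexes(cycle, k=1, q=2))  # the four edges (maximal cliques)
print(maximal_k_plexes(cycle, k=2, q=3))  # the whole cycle is one 2-plex
```

The cost of this baseline grows exponentially with graph size; the pivot selection, upper bound, and vertex-pair pruning techniques listed above are aimed at cutting down exactly this kind of search tree.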