Overlapping Community Detection in Software Ecosystem Based on Pheromone Guided Personalized PageRank Algorithm

Simin Wang, Xiangjuan Yao, Dunwei Gong, Huijie Tu
DOI: https://doi.org/10.1016/j.infsof.2023.107283
IF: 3.9
2023-01-01
Information and Software Technology
Abstract: Context: Software ecosystems have attracted the interest of numerous researchers and play an important role in many areas. Depending on the participants and their relationships, a software ecosystem can be modeled as various types of complex networks. Overlapping community detection in complex networks can help reveal the community structure and find the intersections between communities. However, existing overlapping community detection algorithms often suffer from reduced applicability or accuracy when applied to networks in software ecosystems. Objective: To reveal the overlapping community structure in software ecosystems, we propose an overlapping community detection algorithm that improves the standard personalized PageRank (PPR) algorithm. Method: We first construct a developer collaboration network in a software ecosystem based on the intensity of cooperation between developers. Then, the similarity between developers is calculated to guide the walking process of the PPR algorithm, making it suitable for weighted networks. Finally, in the proposed algorithm, PGPPR, inspired by the use of pheromones to guide walking in the ant colony algorithm, we run the algorithm in multiple rounds and use the results of the previous round to guide the walking process in the current round, which reduces redundant diffusion. Results: The experimental results on five real-world networks show that our algorithm is applicable and effective in detecting communities. In the five developer collaboration networks, PGPPR detects overlapping community structures with higher stability and accuracy than the four baselines. Conclusion: The PGPPR algorithm can find overlapping community structures in weighted networks and effectively reduce the redundant diffusion generated when the standard PPR algorithm is applied to community detection. Compared to other algorithms, our algorithm detects overlapping communities more accurately and stably when applied to developer collaboration networks in software ecosystems.
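The Method paragraph can be illustrated with a minimal Python sketch. It is not the authors' PGPPR implementation: the abstract does not give the similarity measure or the pheromone update rule, so the function names `weighted_ppr` and `guided_ppr`, the parameters `alpha` and `boost`, the edge-bias formula, and the toy graph are all assumptions made for illustration. The sketch only shows the general idea of a weighted personalized PageRank whose transition probabilities in each round are biased by the scores from the previous round.

```python
def weighted_ppr(adj, seed, alpha=0.15, iters=100, bias=None):
    """Personalized PageRank by power iteration on a weighted graph.

    adj  : dict mapping node -> {neighbor: positive edge weight}
    seed : node that receives all restart probability
    bias : optional dict mapping (u, v) -> multiplier on edge u->v
           (hypothetical stand-in for pheromone guidance)
    """
    scores = {n: 0.0 for n in adj}
    scores[seed] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        nxt[seed] += alpha  # total mass is 1, so the restart share is alpha
        for u, nbrs in adj.items():
            w = {v: wt * (bias[(u, v)] if bias else 1.0) for v, wt in nbrs.items()}
            total = sum(w.values())
            if total == 0.0:
                # Dangling node: return its walking mass to the seed.
                nxt[seed] += (1.0 - alpha) * scores[u]
                continue
            for v, wt in w.items():
                nxt[v] += (1.0 - alpha) * scores[u] * wt / total
        scores = nxt
    return scores


def guided_ppr(adj, seed, rounds=3, boost=1.0, alpha=0.15, iters=100):
    """Run several PPR rounds; each round biases edges toward nodes that
    scored highly in the previous round (assumed update rule)."""
    scores = weighted_ppr(adj, seed, alpha, iters)
    for _ in range(rounds - 1):
        top = max(scores.values())
        bias = {(u, v): 1.0 + boost * scores[v] / top
                for u in adj for v in adj[u]}
        scores = weighted_ppr(adj, seed, alpha, iters, bias)
    return scores


# Toy collaboration network (hypothetical): two tightly connected triangles
# joined by one weak edge between c and d.
graph = {
    "a": {"b": 3.0, "c": 3.0},
    "b": {"a": 3.0, "c": 3.0},
    "c": {"a": 3.0, "b": 3.0, "d": 1.0},
    "d": {"c": 1.0, "e": 3.0, "f": 3.0},
    "e": {"d": 3.0, "f": 3.0},
    "f": {"d": 3.0, "e": 3.0},
}

plain = weighted_ppr(graph, "a")
guided = guided_ppr(graph, "a")
for node in graph:
    print(node, round(plain[node], 4), round(guided[node], 4))
```

In this toy graph the guided rounds keep more probability mass on the seed's own triangle than a single unguided run, which is one way to picture "reducing redundant diffusion". Turning the scores into overlapping communities (for example by thresholding scores from several seeds) is left out, since the abstract does not describe that step.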