Find Potential Partners: A GitHub User Recommendation Method Based on Event Data

Shuotong Bai, Lei Liu, Huaxiao Liu, Mengxi Zhang, Chenkun Meng, Peng Zhang
DOI: https://doi.org/10.1016/j.infsof.2022.106961
IF: 3.9
2022-01-01
Information and Software Technology
Abstract: Context: GitHub is popular among software developers around the world and offers the social function "follow" to strengthen relationships among developers. As on other social networks, GitHub users usually follow others who are popular in the community, co-workers, or friends in real life. However, according to our investigation, more than half of GitHub users prefer to follow developers who have recently shown similar interests rather than their traditional networks, in order to communicate with timely feedback, discover niche repositories, and attract more active contributors to cooperate with. Such users are hard to find. Objective: Our objective in this paper is to leverage the recent activities (Event Data) of GitHub users and build a recommendation approach that helps them find recently like-minded developers to follow or reach out to. Methods: As a first step, we conduct an empirical study, an online survey, to investigate whether GitHub users are willing to follow others with similar recent events and which kinds of events they focus on when deciding whom to follow. Based on the survey results, we partition the 12 types of events that participants focus on into three Event sets: Communication, Exploration, and Cooperation. As a second step, we collect Event Data from 12,713 GitHub users who participated in repositories written in Python and build a time-based multi-dimensional recommendation approach that combines a vector-similarity calculation, a clustering approach, and a deep learning model. Results and Conclusion: The experimental results show that our approach achieves average improvements of 607.64%, 564.59%, and 599.19% over two baselines in terms of Precision@N, Recall@N, and F1-Score@N, respectively. These experiments demonstrate that our method is effective and feasible.
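To make the vector-similarity step and the reported metrics more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each user is summarized by a hypothetical three-element vector of recent event counts (Communication, Exploration, Cooperation), ranks candidates by cosine similarity, and scores a top-N list with the standard definitions of Precision@N, Recall@N, and F1-Score@N. The function names, user names, and sample counts are illustrative assumptions; the clustering and deep learning components mentioned in the abstract are not covered.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no events is treated as similar to nobody
    return dot / (norm_a * norm_b)


def recommend_top_n(target, candidates, n):
    """Return the n candidate names whose vectors are most similar to target."""
    ranked = sorted(
        candidates,
        key=lambda user: cosine_similarity(target, candidates[user]),
        reverse=True,
    )
    return ranked[:n]


def precision_recall_f1_at_n(recommended, relevant, n):
    """Standard Precision@N, Recall@N, and F1-Score@N for one user."""
    hits = len(set(recommended[:n]) & set(relevant))
    precision = hits / n if n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical event counts: [Communication, Exploration, Cooperation]
target_user = [4, 1, 0]
candidate_users = {
    "alice": [8, 2, 0],
    "bob": [0, 0, 5],
    "carol": [3, 1, 1],
}

top_2 = recommend_top_n(target_user, candidate_users, n=2)
# Assumed ground truth: users the target actually went on to follow
actually_followed = {"alice", "dave"}
scores = precision_recall_f1_at_n(top_2, actually_followed, n=2)
print(top_2, scores)
```

In this toy setup, averaging the three scores over all evaluated users would give the kind of aggregate Precision@N, Recall@N, and F1-Score@N figures that the abstract compares against its baselines.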