Talking Wikidata: Communication patterns and their impact on community engagement in collaborative knowledge graphs

Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl
2024-07-24
Abstract: We study collaboration patterns of Wikidata, one of the world's largest collaborative knowledge graph communities. Wikidata lacks long-term engagement: a small group of highly valuable members, 0.8%, is responsible for 80% of contributions. Therefore, it is essential to investigate their behavioural patterns and find ways to enhance their contributions and participation. Previous studies have highlighted the importance of discussions among contributors in understanding these patterns. To investigate this, we analyzed all the discussions on Wikidata using a mixed-methods approach, including statistical tests, network analysis, and text and graph embedding representations. Our research showed that the interactions between Wikidata editors form a small-world network in which the content of a post influences the continuity of conversations. We also found that the account age of Wikidata members and their conversations are significant factors in their long-term engagement with the project. Our findings can benefit the Wikidata community by helping its members improve their practices to increase contributions and enhance long-term participation.
Social and Information Networks, Human-Computer Interaction
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **How can the Wikidata community improve long-term participation and contribution rates by analyzing the communication patterns among its members?**

Specifically, as one of the world's largest collaborative knowledge graph communities, Wikidata faces insufficient long-term participation: a core group of 0.8% of members is responsible for 80% of the contributions. It is therefore crucial to study the behaviour patterns of these core members and find ways to enhance their contributions and participation. Previous studies have emphasized the importance of discussions in understanding these behaviour patterns, which prompts the authors to analyze the discussion content on Wikidata in depth.

### Research Background

1. **Current Situation of the Problem**:
   - In the Wikidata community, only a very small share of members (0.8%) is responsible for the vast majority (80%) of the contributions.
   - The main challenge the community faces is how to improve long-term participation, especially the continued participation of these high-contribution members.
2. **Research Motivation**:
   - To understand the communication patterns among members and their impact on community participation.
   - To explore how improving communication mechanisms can increase contributions and long-term participation.
3. **Research Method**:
   - Use a mixed-methods approach, including statistical tests, network analysis, and text and graph embedding representations.
   - Analyze all discussions on Wikidata to reveal the interaction patterns among members.

### Research Questions

To achieve the above goals, the authors pose three research questions (RQs):

- **RQ1**: What are the characteristics of editors' collaboration?
- **RQ2**: What factors affect whether a discussion gets a response?
- **RQ3**: Do discussions affect editors' participation?

### Main Findings

1. **Small-World Network Characteristics**:
   - The interactions among Wikidata editors form a small-world network, which is characterized by a high clustering coefficient and a short average shortest path length.
   - This network structure indicates that editors who take part in discussions are densely clustered and only a few steps apart from one another.
2. **Impact of Discussion Content**:
   - The content of a discussion significantly affects the continuity of the conversation. High-quality discussions are more likely to trigger subsequent exchanges.
3. **Impact of Account Age**:
   - Editors' account age and discussion frequency have a significant impact on their long-term participation. Editors with older accounts are more likely to remain active in the long term.

### Practical Applications

The results of this study can help the Wikidata community improve its practices and tools to increase contributions and long-term participation. The results can also inform recommendation systems that help editors take part in discussions and contributions.

Through these findings, the authors hope to provide valuable insights for the Wikidata community and to promote a healthier and more efficient collaborative environment.
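
The small-world finding rests on two graph statistics: the average clustering coefficient and the average shortest path length. The following is a minimal sketch of how both could be computed on a reply network, not the authors' implementation. The edge list, the editor names, and the function names (`build_graph`, `average_clustering`, `average_shortest_path_length`) are hypothetical and chosen only for illustration. In practice, both values are usually compared against a random graph of the same size and density before a network is called small-world.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (editor, editor) reply pairs."""
    graph = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-replies
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient over all nodes."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue  # coefficient is 0 by convention for degree < 2
        # Count how many pairs of neighbours are themselves connected.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_shortest_path_length(graph):
    """Mean BFS distance over all ordered pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy reply network: each pair means two editors exchanged posts.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
toy_graph = build_graph(toy_edges)

print(round(average_clustering(toy_graph), 3))            # 0.467
print(round(average_shortest_path_length(toy_graph), 3))  # 1.7
```

On a real discussion network, a clustering coefficient well above the random baseline combined with a path length close to it would be consistent with the small-world structure reported above.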