Frequent Subgraph Mining Based Collaboration Pattern Analysis for Wikipedia.

Zheng Hu, Zhonghu Zuo, Chunhong Zhang, Xiaosheng Tang, Yuqian Tang
DOI: https://doi.org/10.5755/j01.itc.48.2.20028
IF: 0.813
2019-01-01
Information Technology And Control
Abstract: Online knowledge collaborations, in which distributed members without hierarchies self-organize to create valuable content, are prevalent in many open production systems such as Wikipedia, GitHub and social networks. While many existing studies from network science have analysed the general interactive behavioural patterns embedded in these systems, how the collaborations influence the achievement outcomes has not been thoroughly investigated. In this paper, we mine the collaboration patterns from a micro perspective to better understand the relationship between the collaboration among participants and the quality of Wikipedia articles. In particular, the subgraphs contained in the collaboration networks derived from the Wikipedia revision histories are taken as the fundamental units for analysing collaboration diversity in terms of subgraph properties such as size and topology. In contrast to the predefined static motifs adopted by previous works, the collaboration subgraphs are found directly from the Wikipedia dataset by the frequent subgraph mining algorithm GRAMI, which is able to capture the real dynamic collaboration patterns. Moreover, the relationships between the co-authors in the subgraphs are also distinguished to further explore the collaboration patterns. The experiments exhibit the statistical properties of the collaboration subgraphs and their efficiency as metrics for article quality assessment. We conclude that a small group of editors with relatively frequent, fixed collaboration patterns contributes more to excellent article quality than the degree of professionalism of arbitrary individuals in the collaboration group. This discovery confirms the common insight about collaboration that many heads are always better than one and concretely suggests a potential explanation for the increasing prevalence and success of online knowledge collaborations.
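As a rough illustration of the kind of pipeline the abstract outlines, the Python sketch below builds a small directed collaboration graph from a revision history and measures how often a one-edge labeled pattern occurs, using minimum-image-based (MNI) support, the frequency measure that GRAMI relies on for a single large graph. The edge rule (an editor links to the editor whose revision they directly follow), the editor labels, the sample data, and all function names are assumptions made for illustration; the abstract does not specify how the paper constructs its networks.

```python
def build_collab_edges(revisions):
    """Build directed edges from an ordered list of editor names.

    Assumed rule (not from the paper): each editor points to the editor
    of the immediately preceding revision, ignoring self-follow-ups.
    """
    edges = set()
    for prev, curr in zip(revisions, revisions[1:]):
        if prev != curr:
            edges.add((curr, prev))
    return edges


def mni_support_edge_pattern(edges, labels, src_label, dst_label):
    """MNI support of the one-edge pattern src_label -> dst_label.

    For each pattern node, collect the distinct graph nodes it maps to
    over all matches; the support is the smallest of those counts.
    """
    src_images, dst_images = set(), set()
    for u, v in edges:
        if labels[u] == src_label and labels[v] == dst_label:
            src_images.add(u)
            dst_images.add(v)
    return min(len(src_images), len(dst_images))


# Hypothetical revision history and editor labels.
revisions = ["ann", "bob", "ann", "cat", "bob", "dan", "bob"]
labels = {"ann": "expert", "bob": "expert", "cat": "novice", "dan": "novice"}

edges = build_collab_edges(revisions)
support = mni_support_edge_pattern(edges, labels, "novice", "expert")
print(support)
```

A pattern would count as frequent when its support reaches a chosen threshold. Mining larger patterns additionally requires a subgraph isomorphism search and pattern extension, which this sketch omits.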