A Hybrid Approach for Tag Hierarchy Construction
Shangwen Wang, Tao Wang, Xiaoguang Mao, Gang Yin, Yue Yu
DOI: https://doi.org/10.1007/978-3-319-90421-4_4
2018-01-01
Abstract: Open source resources play an increasingly important role in reuse within software engineering. However, their rapidly growing scale makes them hard to manage and locate. In this study, we propose a hybrid approach to automatic tag hierarchy construction that combines tag co-occurrence relations with domain knowledge to build and optimize the hierarchy. We first calculate the generality of each tag from its co-occurrence relationships with other tags and construct the hierarchy based on this generality. We then leverage the domain knowledge in existing hierarchical categories to optimize and improve the final hierarchy. We select 8064 projects from the Openhub community and 10703 posts from the StackOverflow community as the original data, and use information from the SourceForge community as the domain knowledge. We conduct extensive experiments and evaluate our approach using WordNet and the F-measure. The results show that our approach outperforms other approaches, with accuracy rate and recall both exceeding 90%.
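The first stage described in the abstract (score each tag's generality from co-occurrence, then build the hierarchy from those scores) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual generality measure or attachment rule, so everything concrete here is an assumption made for illustration: generality is taken to be the number of distinct tags a tag co-occurs with, a tag is attached to a more general tag when the conditional co-occurrence rate reaches a hypothetical `threshold`, and the names `build_tag_hierarchy` and `ROOT` are invented. The domain-knowledge optimization stage is not sketched.

```python
from collections import Counter, defaultdict
from itertools import combinations

ROOT = "<root>"  # hypothetical name for the artificial top of the hierarchy


def build_tag_hierarchy(tagged_items, threshold=0.6):
    """Return a {tag: parent} mapping built from tag co-occurrence.

    tagged_items: iterable of tag lists, one per project or post.
    threshold: assumed minimum share of a tag's items that must also
               carry the candidate parent tag.
    """
    tag_count = Counter()        # number of items carrying each tag
    cooc = defaultdict(Counter)  # cooc[a][b] = items carrying both a and b
    for tags in tagged_items:
        unique = sorted(set(tags))
        tag_count.update(unique)
        for a, b in combinations(unique, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1

    # Assumed generality: number of distinct co-occurring tags,
    # with ties broken by tag frequency and then by name.
    ordered = sorted(
        tag_count,
        key=lambda t: (-len(cooc[t]), -tag_count[t], t),
    )

    parent = {}
    placed = []  # tags already in the hierarchy, most general first
    for tag in ordered:
        best, best_score = ROOT, 0.0
        for cand in placed:
            score = cooc[tag][cand] / tag_count[tag]
            # On ties, prefer the later (more specific) candidate so the
            # tag is nested as deeply as the evidence allows.
            if score >= threshold and score >= best_score:
                best, best_score = cand, score
        parent[tag] = best
        placed.append(tag)
    return parent
```

Calling `build_tag_hierarchy` on a list of tag lists returns each tag's parent; tags with no sufficiently strong co-occurring, more general tag are attached directly to `ROOT`. Lowering `threshold` yields deeper hierarchies at the cost of weaker parent-child links.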