Identifying Scholarly Communities From Unstructured Texts

Ming Liu, Yang Chen, Bo Lang, Li Zhang, Hongting Niu
DOI: https://doi.org/10.1007/978-3-319-96890-2_7
2018-01-01
Abstract: Scholarly community detection has important applications in various fields. Previous studies have relied heavily on structured scholar networks, which have high computational complexity and are difficult to construct in practice. We propose a novel alternative that identifies scholarly communities directly from large textual corpora. To our knowledge, this is the first study to detect communities directly from unstructured texts. Academic articles generally mention related work and researchers, and researchers who are more closely related to each other tend to be mentioned closer together in the text. Based on this correlation, we develop an intuitive method that measures the mutual relatedness of researchers through their textual distance. First, we extract and disambiguate researcher names from academic articles. Then, we embed each researcher as an implicit vector and measure the relatedness of researchers by their vector distance. Finally, we identify communities as clusters of these vectors. We implement and evaluate our method on three real-world datasets. The experimental results demonstrate that our method outperforms state-of-the-art methods.
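The core intuition, that researchers mentioned close together in text are likely related, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes names are already extracted and disambiguated, and it swaps the learned embeddings and vector clustering described in the abstract for simple distance-weighted co-mention scores grouped by connected components. The function names, the `window` and `threshold` parameters, and the sample names are all hypothetical.

```python
from collections import defaultdict


def relatedness(documents, window=2):
    """Score each pair of names by how closely they are mentioned.

    `documents` is a list of name sequences in order of mention.
    A pair mentioned d positions apart (d <= window) adds 1/d.
    """
    scores = defaultdict(float)
    for mentions in documents:
        for i, a in enumerate(mentions):
            for j in range(i + 1, min(i + 1 + window, len(mentions))):
                b = mentions[j]
                if a != b:
                    scores[frozenset((a, b))] += 1.0 / (j - i)
    return scores


def communities(scores, threshold=1.0):
    """Group names linked by a score >= threshold (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair, score in scores.items():
        a, b = tuple(pair)
        root_a, root_b = find(a), find(b)  # registers both names
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for name in list(parent):
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical, already-disambiguated mention sequences.
docs = [
    ["alice", "bob", "carol"],
    ["alice", "bob"],
    ["xavier", "yvonne"],
    ["xavier", "yvonne", "carol"],
]
print(communities(relatedness(docs), threshold=1.5))
```

With this toy input, pairs that are repeatedly mentioned side by side accumulate high scores and end up in the same group, while a name that only co-occurs weakly with others stays on its own. A learned embedding, as the abstract describes, would additionally capture indirect relatedness between researchers who are never mentioned together.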