Task-oriented Keyphrase Extraction from Social Media

Min Yang,Yuzhi Liang,Wei Zhao,Wei Xu,Jia Zhu,Qiang Qu
DOI: https://doi.org/10.1007/s11042-017-5041-y
IF: 2.577
2017-01-01
Multimedia Tools and Applications
Abstract:Keyphrase extraction from social media is a crucial and challenging task. Previous studies usually focus on extracting keyphrases that provide the summary of a corpus. However, they do not take users’ specific needs into consideration. In this paper, we propose a novel three-stage model to learn a keyphrase set that represents or related to a particular topic. Firstly, a phrase mining algorithm is applied to segment the documents into human-interpretable phrases. Secondly, we propose a weakly supervised model to extract candidate keyphrases, which uses a few pre-specific seed keyphrases to guide the model. The model consequently makes the extracted keyphrases more specific and related to the seed keyphrases (which reflect the user’s needs). Finally, to further identify the implicitly related phrases, the PMI-IR algorithm is employed to obtain the synonyms of the extracted candidate keyphrases. We conducted experiments on two publicly available datasets from news and Twitter. The experimental results demonstrate that our approach outperforms the state-of-the-art baselines and has the potential to extract high-quality task-oriented keyphrases.
What problem does this paper attempt to address?