An Annotation System to Annotate Healthcare Information from Tweets

Nixon Dutta, Anupam Mondal, Pritam Paul
DOI: https://doi.org/10.1007/978-981-13-7403-6_30
2019-07-17
Abstract: This paper presents an approach to using social media data to improve the healthcare services provided by doctors and related industries. Researchers have observed that, on average, around 5 million tweets per day and around 2 billion tweets per year related to health care are posted on Twitter. This large data source can be analyzed to gain better knowledge of recent trends and discoveries in a particular field. We are therefore motivated to develop a structured corpus from scratch and to identify concepts and categories by applying a machine learning approach to the extracted unstructured and semi-structured corpora. To build the system, we employ two well-known classifiers, multinomial Naive Bayes and support vector machine, on our prepared experimental dataset. The experimental dataset is divided into training and test sets, which are used to build and to validate the module, respectively. The proposed module assigns healthcare concepts and their categories to tweets. Validation yields F-measures of 0.67 for concept identification and 0.57 for categorization. This annotation system may help in designing applications such as summarization and recommendation systems in health care to assist medical practitioners.
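To make the classification step concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier scored with the F-measure, written in plain Python. It is an illustration only and not the authors' implementation: the toy tweets, the labels "health" and "other", the whitespace tokenizer, and the helper names (tokenize, MultinomialNB, f_measure) are all assumptions made here, and the support vector machine that the paper also uses is not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    # Real tweets would also need handling of hashtags, mentions, and URLs.
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f_measure(gold, predicted, positive):
    # Harmonic mean of precision and recall for one class.
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not taken from the paper's corpus.
train_texts = [
    "new vaccine trial shows promise against flu",
    "doctor recommends rest for fever and cough",
    "diabetes patients need regular insulin checks",
    "great football match last night",
    "new phone launch next week",
    "traffic was terrible this morning",
]
train_labels = ["health", "health", "health", "other", "other", "other"]

test_texts = ["flu and fever season is here", "football traffic near the stadium"]
test_labels = ["health", "other"]

model = MultinomialNB().fit(train_texts, train_labels)
predictions = [model.predict(t) for t in test_texts]
score = f_measure(test_labels, predictions, positive="health")
print(predictions, score)
```

On a realistic corpus, the same train, predict, and score loop would be run separately for concept identification and for categorization, which is how two distinct F-measures such as those reported above would arise.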