Profiling the news spreading barriers using news headlines

Abdul Sittar, Dunja Mladenic, Marko Grobelnik
2023-04-07
Abstract: News headlines can be a good data source for detecting the barriers to news spreading in news media, which may be useful in many real-world applications. In this paper, we utilize semantic knowledge obtained through the inference-based model COMET, together with the sentiment of news headlines, for barrier classification. We consider five barriers (cultural, economic, political, linguistic, and geographical) and ten types of news headlines (health, sports, science, recreation, games, homes, society, shopping, computers, and business). To that end, we collect news headlines and label them automatically for the barriers using the metadata of news publishers. Then, we use the extracted commonsense inferences and sentiments as features to detect the news-spreading barriers. We compare our approach to classical text classification, deep learning, and transformer-based methods. The results show that the proposed approach, which uses inference-based semantic knowledge and sentiment, performs better than these baselines at classifying the news-spreading barriers: the average F1-score over the ten categories improves from 0.41, 0.39, 0.59, and 0.59 to 0.47, 0.55, 0.70, and 0.76 for the cultural, economic, political, and geographical barriers, respectively.
Computation and Language, Artificial Intelligence, Machine Learning, Social and Information Networks
What problem does this paper attempt to address?
The paper aims to address the classification of various barriers encountered in the process of news dissemination and to explore how these barriers affect the dissemination of different types of news at social, national, and international levels. The main barriers the authors focus on include cultural barriers, economic barriers, political barriers, linguistic barriers, and geographical barriers. By analyzing the semantic knowledge and emotional tendencies in news headlines, the paper proposes a new method to identify and classify these barriers.

To achieve this goal, the authors adopted the following steps:

1. **Data Collection**: Extracted news articles of different categories (such as business, computer, gaming, etc.) from Event Registry.
2. **Metadata Extraction**: Obtained relevant information about the news media through web crawling techniques, such as their political stance, publication language, etc.
3. **Data Annotation**: Annotated the news based on factors such as the geographical location of publication, economic status, cultural differences, etc., to identify whether the information crossed specific barriers.
4. **Sentiment Analysis and Commonsense Inference**: Extracted semantic knowledge about news headlines using sentiment analysis results and commonsense reasoning based on the COMET model.
5. **Comparison of Classification Methods**: Classified news dissemination barriers using traditional machine learning methods (such as logistic regression, naive Bayes, etc.), deep learning methods (such as LSTM), and transformer-based methods (such as BERT), and compared the performance of these methods.

The main contributions of the paper include:

- Proposing an automatic annotation method for information barriers based on news metadata.
- Constructing a benchmark dataset for barrier classification.
- Developing a news dissemination barrier classification method based on semantic knowledge and sentiment analysis.
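The annotation and feature-extraction steps (steps 3 and 4 above) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the `Publisher` fields, the rule that a barrier counts as "crossed" when the publishers covering a news item differ on the corresponding attribute, the `[SEP]` separator, and the sample inference strings (the relation names mimic the style of COMET relations, but no model is called here).

```python
from dataclasses import dataclass


# Hypothetical publisher metadata record; the field names are assumptions,
# not the schema used in the paper.
@dataclass(frozen=True)
class Publisher:
    name: str
    country: str
    language: str
    economic_class: str
    political_alignment: str
    culture_group: str


# Assumed mapping from each of the five barriers to one metadata attribute.
BARRIER_FIELDS = {
    "geographical": "country",
    "linguistic": "language",
    "economic": "economic_class",
    "political": "political_alignment",
    "cultural": "culture_group",
}


def crossed_barriers(publishers):
    """Label one news item: a barrier is marked True when the publishers
    that covered the item differ on the attribute tied to that barrier."""
    labels = {}
    for barrier, field in BARRIER_FIELDS.items():
        values = {getattr(p, field) for p in publishers}
        labels[barrier] = len(values) > 1
    return labels


def build_features(headline, inferences, sentiment):
    """Join the headline, its commonsense inferences, and its sentiment
    label into one text input for a downstream classifier."""
    parts = [headline]
    parts += [f"{relation}: {text}" for relation, text in sorted(inferences.items())]
    parts.append(f"sentiment: {sentiment}")
    return " [SEP] ".join(parts)


# Usage with made-up publishers and made-up inference strings.
pub_a = Publisher("Outlet A", "SI", "sl", "high", "center", "group-1")
pub_b = Publisher("Outlet B", "US", "en", "high", "left", "group-2")

labels = crossed_barriers([pub_a, pub_b])
features = build_features(
    "Team wins the championship",
    {"xIntent": "to win", "xReact": "happy"},
    "positive",
)
print(labels)
print(features)
```

Keeping the labelling rule separate from the feature builder means either piece can be replaced, for example by a different crossing criterion or by real model outputs, without touching the other.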
Experimental results show that the proposed method, which combines inference-based semantic knowledge and sentiment analysis, outperforms traditional text classification methods, deep learning methods, and transformer-based methods in classifying news dissemination barriers. This indicates that analyzing the semantic knowledge and emotional tendencies in news headlines can effectively identify and understand various barriers in the news dissemination process.
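To make the size of the reported improvement concrete, the short sketch below computes the absolute and relative F1 gains from the figures quoted in the abstract. The scores themselves come from the abstract; the helper name `f1_gains` and the rounding are illustrative choices. The abstract does not say which baseline the "before" values correspond to, and it gives no figure for the linguistic barrier, so neither is assumed here.

```python
# (baseline, proposed) average F1-scores over the ten categories,
# as quoted in the abstract. No linguistic figure is reported there.
f1_scores = {
    "cultural": (0.41, 0.47),
    "economic": (0.39, 0.55),
    "political": (0.59, 0.70),
    "geographical": (0.59, 0.76),
}


def f1_gains(scores):
    """Return {barrier: (absolute gain in F1, relative gain in percent)}."""
    gains = {}
    for barrier, (baseline, proposed) in scores.items():
        absolute = round(proposed - baseline, 2)
        relative = round(100 * (proposed - baseline) / baseline, 1)
        gains[barrier] = (absolute, relative)
    return gains


gains = f1_gains(f1_scores)
for barrier, (absolute, relative) in gains.items():
    print(f"{barrier:>12}: +{absolute:.2f} F1 ({relative:.1f}% relative)")
```

On these figures the absolute gains range from 0.06 (cultural) to 0.17 (geographical), with the economic barrier showing the largest relative gain.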