Abstract: The huge popularity of social media platforms like Twitter attracts a large fraction of users to share real-time information and short situational messages during disasters. Government organizations, agencies, and volunteers need a summary of these tweets for efficient and quick disaster response. However, the huge influx of tweets makes it difficult to manually get a precise overview of ongoing events. To handle this challenge, several tweet summarization approaches have been proposed. Most of the existing literature breaks tweet summarization into a two-step process: the first step categorizes tweets, and the second step chooses representative tweets from each category. Both supervised and unsupervised approaches have been proposed for the first step. Supervised approaches require a large amount of labelled data, which costs both money and time. Unsupervised approaches, on the other hand, cannot cluster tweets properly because of overlapping keywords, vocabulary size, a lack of understanding of semantic meaning, etc. For the second step, existing approaches apply generic ranking methods that fail to compute the proper importance of a tweet with respect to a disaster. Both problems can be handled far better with proper domain knowledge. In this paper, we exploit existing domain knowledge by means of an ontology in both steps and propose a novel disaster summarization method, OntoDSumm. We evaluate the proposed method against 4 state-of-the-art methods on 10 disaster datasets. Evaluation results reveal that OntoDSumm outperforms existing methods by approximately 2-66% in terms of ROUGE-1 F1 score.

What problem does this paper attempt to address?

The paper addresses the following problem: during a disaster, how can effective summaries be generated from a large number of tweets so that government organizations, agencies, and volunteers can understand the situation quickly and accurately and respond effectively? Specifically, it targets two sub-problems:

1. **Challenges in tweet classification**: Existing tweet classification methods are either supervised or unsupervised. Supervised learning requires a large amount of labeled data, which is both time-consuming and expensive to obtain. Unsupervised learning cannot classify tweets well because of problems such as vocabulary overlap and insufficient semantic understanding.
2. **Challenges in representative tweet selection**: Existing methods usually select representative tweets with general-purpose ranking algorithms, which cannot accurately assess the importance of a tweet in a specific disaster. In addition, the importance of each type of information varies from one disaster event to another, and existing methods do not fully account for this.

To solve these problems, the authors propose an ontology-based tweet summarization method, **OntoDSumm**, which works in three phases:

- **Phase-I**: Use existing knowledge of the disaster domain (such as the Empathi ontology) to automatically classify tweets into categories. This phase is unsupervised, and classification accuracy is improved by expanding the ontology vocabulary.
- **Phase-II**: Use a new scoring mechanism to automatically predict the importance of each category in a given disaster event. A "disaster similarity index" identifies historical disasters similar to the current one, and the importance of each category is determined accordingly.
- **Phase-III**: Use an improved Disaster-specific Maximal Marginal Relevance (DMMR) algorithm to select the most representative tweets from each category, so that the summary is comprehensive and diverse.

Through these three phases, OntoDSumm generates tweet summaries for disaster events more effectively. Compared with existing methods, it improves the ROUGE-1 F1 score by approximately 2%-66%.

### Formula presentation

The following key formulas from the paper illustrate how OntoDSumm works:

1. **Semantic Similarity Score**:
\[ \text{SemSIM}(T_j, C_i)=\frac{|Kw(T_j)\cap Kw(C_i)|}{|Kw(T_j)\cup Kw(C_i)|} \]
where \(Kw(T_j)\) is the set of keywords of tweet \(T_j\), and \(Kw(C_i)\) is the set of keywords of category \(C_i\).

2. **Maximal Semantic Similarity Score (MaxSIM)**:
\[ \text{MaxSIM}(T_j)=\arg\max_{i\in K}(\text{SemSIM}(T_j, C_i)) \]
that is, \(T_j\) is assigned to the category \(C_i\) with the highest SemSIM score.

3. **Objective function for summary generation**:
\[ T^*=\arg\max_{T_j\in C_i}(\alpha\cdot ICov(T_j, In(C_i))+\beta\cdot Div(T_j, S)) \]
where \(ICov(T_j, In(C_i))\) is the information coverage of tweet \(T_j\) for category \(C_i\), \(Div(T_j, S)\) is the diversity that tweet \(T_j\) contributes when added to the summary \(S\), and \(\alpha\) and \(\beta\) are adjustable parameters.

With these formulas, OntoDSumm improves the diversity and representativeness of the summary while maintaining information coverage.
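The SemSIM, MaxSIM, and summary-objective formulas can be illustrated with a minimal Python sketch. It is not the authors' implementation. SemSIM is the keyword-set overlap exactly as written in the formula, but the summary does not define ICov or Div, so the sketch assumes ICov is the fraction of a category's important keywords that a tweet covers and Div is one minus the tweet's highest keyword overlap with any tweet already in the summary. The function names `sem_sim`, `assign_category`, and `select_tweets`, the "no overlap means no category" rule, and the default weights are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def sem_sim(tweet_kw: Set[str], cat_kw: Set[str]) -> float:
    """SemSIM: |intersection| / |union| of the two keyword sets."""
    union = tweet_kw | cat_kw
    return len(tweet_kw & cat_kw) / len(union) if union else 0.0


def assign_category(tweet_kw: Set[str],
                    categories: Dict[str, Set[str]]) -> Optional[str]:
    """MaxSIM: return the category with the highest SemSIM.

    Assumption: a tweet with no keyword overlap at all gets no category.
    """
    best = max(categories, key=lambda c: sem_sim(tweet_kw, categories[c]))
    return best if sem_sim(tweet_kw, categories[best]) > 0 else None


def select_tweets(tweets: Dict[str, Set[str]],
                  important_kw: Set[str],
                  k: int,
                  alpha: float = 0.5,
                  beta: float = 0.5) -> List[str]:
    """Greedily pick k tweet ids maximizing alpha * ICov + beta * Div.

    ICov and Div are assumed definitions (see the lead-in), not the paper's.
    """
    summary: List[str] = []
    remaining = dict(tweets)

    def score(tid: str) -> float:
        kw = remaining[tid]
        # Assumed ICov: share of the category's important keywords covered.
        icov = len(kw & important_kw) / len(important_kw) if important_kw else 0.0
        # Assumed Div: 1 - highest overlap with any already-selected tweet.
        div = 1.0 - max((sem_sim(kw, tweets[s]) for s in summary), default=0.0)
        return alpha * icov + beta * div

    while remaining and len(summary) < k:
        best = max(remaining, key=score)
        summary.append(best)
        del remaining[best]
    return summary


if __name__ == "__main__":
    # Toy, made-up keyword sets for illustration only.
    categories = {
        "infrastructure": {"bridge", "road", "collapsed", "damage"},
        "casualties": {"dead", "injured", "killed", "missing"},
    }
    print(assign_category({"bridge", "collapsed", "river"}, categories))

    tweets = {
        "t1": {"bridge", "collapsed"},
        "t2": {"bridge", "collapsed"},  # duplicate content of t1
        "t3": {"road", "damage"},
    }
    print(select_tweets(tweets, categories["infrastructure"], k=2))
```

In the toy run, the near-duplicate tweet `t2` is skipped in favor of `t3` because its assumed diversity term drops to zero once `t1` is in the summary. This is the trade-off the \(\alpha\) and \(\beta\) weights control.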

OntoDSumm : Ontology based Tweet Summarization for Disaster Events
