Clustering Filipino Disaster-Related Tweets Using Incremental and Density-Based Spatiotemporal Algorithm with Support Vector Machines for Needs Assessment

Ocean M. Barba, Franz Arvin T. Calbay, Angelica Jane S. Francisco, Angel Luis D. Santos, Charmaine S. Ponay
DOI: https://doi.org/10.48550/arXiv.2108.06853
2021-08-16
Abstract: Social media plays a major role in how people get informed and communicate with one another. It helps people express their needs when in distress, especially during disasters. Because posts on Twitter are publicly accessible by default, it is among the most helpful social media sites in times of disaster. This study therefore aims to assess the needs that Filipinos express on Twitter during calamities. Tweets were gathered and classified as either disaster-related or unrelated using a Naïve Bayes classifier. The disaster-related tweets were then clustered by disaster type using an Incremental Clustering Algorithm, and sub-clustered by the location and time of each tweet using a Density-Based Spatiotemporal Clustering Algorithm. Lastly, Support Vector Machines classified the tweets according to the expressed need, such as shelter, rescue, relief, cash, prayer, and others. Results showed that the Incremental Clustering Algorithm and the Density-Based Spatiotemporal Clustering Algorithm clustered the tweets with f-measure scores of 47.20% and 82.28%, respectively. The Naïve Bayes classifier achieved an average f-measure score of 97%, and the Support Vector Machines achieved an average accuracy of 77.57%.
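The incremental clustering step can be illustrated with a minimal single-pass sketch. The abstract does not specify the similarity measure, threshold, or centroid update the authors used, so everything here is an assumption: the function names (`vectorize`, `cosine`, `incremental_cluster`), the bag-of-words cosine similarity, the threshold of 0.3, and the four sample tweets, which are invented for illustration only.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a tweet into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def incremental_cluster(tweets, threshold=0.3):
    """Single-pass clustering (assumed variant, not the paper's exact method).

    Each tweet joins the most similar existing cluster when the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    centroids = []  # one summed term-frequency Counter per cluster
    members = []    # tweet indices belonging to each cluster
    for i, text in enumerate(tweets):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            centroids[best].update(vec)  # fold the tweet into the centroid
            members[best].append(i)
        else:
            centroids.append(Counter(vec))
            members.append([i])
    return members


# Hypothetical tweets: two about a flood ("baha"), two about an earthquake ("lindol").
tweets = [
    "baha sa marikina kailangan ng rescue",
    "baha sa marikina lubog ang bahay",
    "lindol sa davao malakas ang lindol",
    "lindol sa davao may aftershock",
]
clusters = incremental_cluster(tweets)
print(clusters)
```

Because tweets are processed one at a time and never revisited, this style of clustering suits a streaming source such as Twitter, though the result depends on arrival order and on the chosen threshold.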
Computation and Language, Machine Learning