Dancing in the syntax forest: fast, accurate and explainable sentiment analysis with SALSA

Carlos Gómez-Rodríguez, Muhammad Imran, David Vilares, Elena Solera, Olga Kellert
2024-06-23
Abstract: Sentiment analysis is a key technology for companies and institutions to gauge public opinion on products, services or events. However, for large-scale sentiment analysis to be accessible to entities with modest computational resources, it needs to be performed in a resource-efficient way. While some efficient sentiment analysis systems exist, they tend to apply shallow heuristics, which do not take into account syntactic phenomena that can radically change sentiment. Conversely, alternatives that take syntax into account are computationally expensive. The SALSA project, funded by the European Research Council under a Proof-of-Concept Grant, aims to leverage recently-developed fast syntactic parsing techniques to build sentiment analysis systems that are lightweight and efficient, while still providing accuracy and explainability through the explicit use of syntax. We intend our approaches to be the backbone of a working product of interest for SMEs to use in production.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the trade-off between efficiency and accuracy in sentiment analysis, especially in large-scale applications. Specifically, it sets out to develop an efficient, lightweight sentiment analysis system that can handle complex language structures while maintaining high accuracy and interpretability. The argument has two parts:

1. **Limitations of Existing Systems**:
   - **Shallow Heuristic Methods**: Systems such as SentiStrength perform well on simple sentiment analysis tasks, but they rely on shallow heuristics and cannot correctly handle changes in overall sentiment polarity that are determined by syntactic structure.
   - **High Computational Cost**: Pre-trained language models or explicit syntactic parsing can improve accuracy and interpretability, but their computational cost is too high for small entities with limited computing resources to adopt.
2. **Solutions**:
   - Use the fast syntactic parsing technology developed in the FASTPARSE project to build a sentiment analysis system that is efficient and that uses explicit syntax to improve accuracy and interpretability.
   - Treat syntactic parsing as a sequence labeling task, which significantly improves parsing speed, and apply it to sentiment analysis. This reduces computational cost and makes large-scale sentiment analysis affordable for small institutions and companies.

Through these two lines of work, the paper hopes to significantly improve the efficiency of sentiment analysis while preserving accuracy and interpretability, making it more widely applicable in real-world scenarios.
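To make the idea of parsing as sequence labeling more concrete, here is a minimal sketch in Python. It is not the SALSA or FASTPARSE implementation. It assumes one simple encoding in which each token receives a single label of the form `offset@relation`, where `offset` is the distance to the token's head, and it assumes those labels have already been predicted by some tagger. The lexicon, the negator list, and the function names `decode_heads` and `sentence_score` are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon and negator list (assumptions for illustration only).
LEXICON: Dict[str, float] = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}


def decode_heads(labels: List[str]) -> List[Tuple[int, str]]:
    """Turn one label per token ("offset@relation") into (head, relation) pairs.

    Tokens are 1-indexed; head 0 denotes the artificial root.
    """
    arcs = []
    for i, label in enumerate(labels, start=1):
        offset, rel = label.split("@")
        head = 0 if rel == "root" else i + int(offset)
        arcs.append((head, rel))
    return arcs


def sentence_score(tokens: List[str], labels: List[str]) -> float:
    """Sum lexicon scores bottom-up over the tree, flipping the sign of a
    subtree when its head has a negator as a direct dependent."""
    arcs = decode_heads(labels)
    children: Dict[int, List[int]] = {i: [] for i in range(len(tokens) + 1)}
    for i, (head, _) in enumerate(arcs, start=1):
        children[head].append(i)

    def score(node: int) -> float:
        total = LEXICON.get(tokens[node - 1].lower(), 0.0)
        negated = False
        for child in children[node]:
            if tokens[child - 1].lower() in NEGATORS:
                negated = True
            else:
                total += score(child)
        return -total if negated else total

    return sum(score(root) for root in children[0])


tokens = ["The", "movie", "was", "not", "good"]
# Assumed tagger output: one label per token, so parsing costs one tagging pass.
labels = ["+1@det", "+3@nsubj", "+2@cop", "+1@advmod", "0@root"]
print(sentence_score(tokens, labels))
```

Because the negator attaches to "good" in the tree, the positive score of that subtree is flipped. A purely word-counting heuristic would see only the positive word. The tree also makes the result explainable, since each polarity change can be traced to a specific dependency arc.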