A comprehensive survey on deep learning-based approaches for multimodal sentiment analysis

Alireza Ghorbanali, Mohammad Karim Sohrabi
DOI: https://doi.org/10.1007/s10462-023-10555-8
IF: 9.588
2023-07-28
Artificial Intelligence Review
Abstract: Sentiment analysis is an important natural language processing problem with applications in many fields. The growing popularity of social networks, and the development of their related tools and technologies, has led users to share multimodal content and opinions that combine different media, including text, images, video, audio and emojis. Users’ increasing interest in sharing content through a combination of several media has significantly increased the amount of multimodal data. Most of the comments that users post on social media have emotional aspects and provide useful indicators for many purposes. Compared to single-modal data, such as text-only or image-only comments, multimodal data contain more useful information and lead to a better understanding of users’ real sentiments. Many studies have been conducted in this area, each of which deals with one or more of the common challenges of multimodal sentiment analysis methods, including incomplete data, heterogeneity of modalities, the method used to fuse results, interactions between modalities, and the presence of unrelated, insufficient and redundant data and information. The emergence of deep neural networks and the evolution of deep learning tools and techniques have led to the development of deep learning-based approaches to multimodal sentiment analysis that address its challenges and constraints. This paper is a comprehensive comparative survey of sentiment analysis approaches, challenges, applications, and trends, with a special focus on deep learning-based multimodal sentiment analysis methods. It also examines the limitations of recent studies, describes possible future solutions, evaluates existing challenges, and assesses future directions for these methods.
computer science, artificial intelligence
What problem does this paper attempt to address?
This paper addresses the challenges and limitations of Multimodal Sentiment Analysis (MSA). With the rise of social media, users increasingly combine multiple media forms such as text, images, video, audio and emoticons to express opinions and emotions. The resulting multimodal data contain more useful information than single-modal data (such as text-only or image-only comments) and help to better understand the actual emotions of users. However, multimodal sentiment analysis faces many challenges, including:

1. **Incomplete data**: Information in some modalities may be missing.
2. **Modal heterogeneity**: Data from different modalities have different characteristics and representations, which makes fusing them effectively difficult.
3. **Result fusion method**: The analysis results of different modalities must be fused effectively to improve overall sentiment analysis performance.
4. **Interaction between modalities**: Different modalities may interact in complex ways that require effective modeling methods.
5. **Irrelevant, insufficient and redundant data**: Multimodal data may contain a large amount of irrelevant, insufficient or redundant information that has to be handled.

To address these challenges, the paper conducts a comprehensive review of deep learning-based multimodal sentiment analysis methods, focusing on the following points:

- **Limitations of existing research**: Analyzes the limitations and complexity of current multimodal sentiment analysis methods.
- **Future solutions**: Proposes possible future solutions, evaluates existing challenges, and explores future research directions.
- **Fusion methods**: Describes in detail different data fusion strategies and their impact on the efficiency of multimodal sentiment analysis.

In doing so, the paper aims to provide a comprehensive picture of the field of multimodal sentiment analysis, helping researchers and practitioners better understand and deal with the challenges in this field.
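
The incomplete-data and result-fusion challenges described above can be made concrete with a small sketch. The Python below is an illustration only and is not taken from the paper: it assumes each modality has already produced a sentiment score in [-1, 1], and it combines the scores with a weighted average (a simple form of decision-level, or "late", fusion). The function name `late_fusion`, the modality names, and the weights are all hypothetical.

```python
from typing import Dict, Optional


def late_fusion(
    scores: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> Optional[float]:
    """Weighted average of per-modality sentiment scores in [-1, 1].

    A score of None marks a missing modality. Missing modalities are
    skipped and the remaining weights are renormalized, so the result
    stays in [-1, 1]. Returns None when no modality is available.
    Every modality in `scores` is assumed to have an entry in `weights`.
    """
    available = {m: s for m, s in scores.items() if s is not None}
    total_weight = sum(weights[m] for m in available)
    if not available or total_weight == 0:
        return None
    weighted_sum = sum(weights[m] * s for m, s in available.items())
    return weighted_sum / total_weight


# Hypothetical weights; in practice these would be tuned or learned.
weights = {"text": 0.5, "image": 0.25, "audio": 0.25}

# All three modalities present.
full = late_fusion({"text": 0.8, "image": 0.0, "audio": -0.4}, weights)

# Image modality missing: fusion falls back to text and audio only.
partial = late_fusion({"text": 0.8, "image": None, "audio": -0.4}, weights)

# No modality available at all.
empty = late_fusion({"text": None, "image": None, "audio": None}, weights)
```

Renormalizing the weights is only one way to handle a missing modality, and it ignores interactions between modalities entirely. Fusing learned feature representations rather than final scores is a common alternative to this kind of decision-level averaging.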