Multimodal Sentiment Analysis of Graphic Texts Based on Multicategorical Relative Fusion

Mingyang Gao, Sheping Zhai, Rui Yang
DOI: https://doi.org/10.1145/3641584.3641767
2023-09-22
Abstract: Unlike traditional unimodal sentiment analysis, multimodal sentiment analysis can jointly process features from different modalities. Existing multimodal sentiment analysis methods simply extract features from text and images, ignoring the influence on sentiment polarity judgment of emoticons that may be contained in the text and of scene information in the images. At the same time, the fusion of unimodal features does not fully integrate the features of each modality, resulting in low sentiment analysis accuracy. Therefore, this paper proposes a multimodal sentiment analysis model based on multicategorical relative fusion of images and texts. In the feature extraction stage, the model separates text content from emoji during word embedding, queries a traditional emoji sentiment dictionary to obtain the emoji sentiment scores, and introduces an attention mechanism to extract the scene features and object features of images separately. In the feature fusion stage, the final multimodal features are obtained by fusing the emoji sentiment scores, the mixed image features, and the text features with layer normalization. Experiments on the Yelp dataset show that the improvements in the feature extraction and feature fusion stages effectively raise the accuracy and F1 value of multimodal sentiment analysis: accuracy increases by 4.7 percentage points on the New York dataset, and the F1 value reaches 85.7% on the San Francisco dataset.
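The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the emoji dictionary and its scores, the function names, and the choice of plain concatenation followed by layer normalization are all assumptions made for illustration, and the text and image feature vectors are placeholders for whatever encoders the model actually uses.

```python
import math

# Hypothetical emoji sentiment dictionary with scores in [-1, 1].
# These values are illustrative only and do not come from the paper.
EMOJI_SCORES = {"😊": 0.8, "😡": -0.9, "👍": 0.6}


def split_text_and_emoji(text):
    """Separate dictionary emoji from the remaining text.

    Works per code point, so it only handles single-code-point emoji.
    """
    emojis = [ch for ch in text if ch in EMOJI_SCORES]
    plain = "".join(ch for ch in text if ch not in EMOJI_SCORES)
    return " ".join(plain.split()), emojis


def emoji_score(emojis):
    """Mean dictionary score of the emoji; 0.0 when there are none."""
    if not emojis:
        return 0.0
    return sum(EMOJI_SCORES[e] for e in emojis) / len(emojis)


def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def fuse(text_feat, image_feat, score):
    """Concatenate the features and the emoji score, then layer-normalize.

    Concatenation is an assumed stand-in for the paper's fusion step.
    """
    return layer_norm(list(text_feat) + list(image_feat) + [score])


plain, emojis = split_text_and_emoji("Great food 😊👍")
fused = fuse([0.1, 0.2], [0.3], emoji_score(emojis))
```

Here `plain` holds the review text with the emoji removed, and `fused` is a single normalized vector that a downstream classifier could consume.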
Computer Science