Complementary or Substitutive? A Novel Deep Learning Method to Leverage Text-image Interactions for Multimodal Review Helpfulness Prediction

Shuaiyong Xiao, Gang Chen, Chenghong Zhang, Xiangge Li
DOI: https://doi.org/10.1016/j.eswa.2022.118138
IF: 8.5
2022-07-20
Expert Systems with Applications
Abstract: With the flourishing of the mobile Internet, multimodal reviews (i.e., reviews with both text and images) are becoming prevalent and play an important role in customer decision making. However, multimodal review helpfulness prediction (MRHP) is difficult because of the information interaction between text and images. The information in review text (images) can be either complementary or substitutive to visual (textual) review information. Moreover, the text (or the images) alone may predominantly constitute the review's diagnostic value in some cases, whereas in others the two may be jointly perceived as useful by customers. In this study, we conduct MRHP by modeling these text-image interactions. We propose a novel multimodal deep learning method that exploits the complementation and substitution effects between text and images and further coordinates them for MRHP. Empirical evaluation on a large-scale online review dataset shows that our proposed method outperforms the benchmarks, indicating its strong capability to predict the helpfulness of multimodal reviews. Exploratory analysis provides insights into the complementary-substitutive interaction patterns between review text and images.
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science
What problem does this paper attempt to address?