Do We Really Need to Drop Items with Missing Modalities in Multimodal Recommendation?

Daniele Malitesta, Emanuele Rossi, Claudio Pomo, Tommaso Di Noia, Fragkiskos D. Malliaros
2024-08-22
Abstract: Generally, items with missing modalities are dropped in multimodal recommendation. However, with this work, we question this procedure, highlighting that it would further damage the pipeline of any multimodal recommender system. First, we show that the lack of (some) modalities is, in fact, a widely-diffused phenomenon in multimodal recommendation. Second, we propose a pipeline that imputes missing multimodal features in recommendation by leveraging traditional imputation strategies in machine learning. Then, given the graph structure of the recommendation data, we also propose three more effective imputation solutions that leverage the item-item co-purchase graph and the multimodal similarities of co-interacted items. Our method can be plugged into any multimodal RSs in the literature working as an untrained pre-processing phase, showing (through extensive experiments) that any data pre-filtering is not only unnecessary but also harmful to the performance.
Information Retrieval
What problem does this paper attempt to address?
The paper asks whether items with missing modality information really need to be removed in multimodal recommender systems. The usual practice is to drop any item that lacks one or more modalities (such as visual or textual features) before training. The authors argue that this filtering can hurt the system: removing items also removes their interactions, which makes the user-item interaction data even sparser and reduces recommendation accuracy. They point out that missing modalities are common in practice, especially in e-commerce, where it is not uncommon for product image URLs to be inaccessible or for product pages to lack descriptions and reviews. The same holds in academic research: in the Amazon Reviews dataset, for example, many items lack visual or textual information. The authors therefore pose the question: "In multimodal recommendations, do we really need to remove items with missing modality information?" Their experiments show that, when the missing features are filled in by an untrained pre-processing step, dropping these items is not only unnecessary but actually harmful to recommendation performance. Specifically, the paper proposes a pipeline that imputes missing multimodal features using traditional machine learning imputation strategies, along with three graph-based imputation methods that exploit the item-item co-purchase graph and the multimodal similarities of co-interacted items, which the authors describe as more effective.
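To make the idea of untrained imputation concrete, here is a minimal Python sketch of two pre-processing strategies of the kind the summary describes: a traditional one (global mean) and a graph-based one (a weighted average over co-interacted items). It is not the authors' implementation, and none of their three graph-based methods are reproduced here. The function names `mean_impute` and `neighbor_impute`, the toy data, and the choice of raw co-interaction counts as edge weights are all assumptions made for illustration.

```python
import numpy as np


def mean_impute(features, missing):
    """Traditional baseline: fill missing rows with the mean of observed rows."""
    filled = np.array(features, dtype=float)
    filled[missing] = filled[~missing].mean(axis=0)
    return filled


def neighbor_impute(features, missing, interactions, n_iters=2):
    """Illustrative graph-based fill (an assumption, not the paper's method).

    Each missing item receives the weighted average of the features of items
    it shares users with; weights are co-interaction counts.
    """
    # Item-item graph: entry (i, j) counts users who interacted with both items.
    adj = (interactions.T @ interactions).astype(float)
    np.fill_diagonal(adj, 0.0)

    filled = np.array(features, dtype=float)
    filled[missing] = 0.0
    known = ~missing

    for _ in range(n_iters):
        weights = adj * known              # ignore neighbors without features
        totals = weights.sum(axis=1)
        fillable = ~known & (totals > 0)
        if not fillable.any():
            break
        filled[fillable] = (weights[fillable] @ filled) / totals[fillable, None]
        known = known | fillable           # newly filled items can propagate

    # Items with no usable neighbors fall back to the global mean.
    if (~known).any():
        filled[~known] = np.array(features, dtype=float)[~missing].mean(axis=0)
    return filled


# Toy data (hypothetical): 4 users x 4 items, items 2 and 3 lack features.
interactions = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
])
features = np.array([
    [1.0, 0.0],
    [3.0, 2.0],
    [np.nan, np.nan],
    [np.nan, np.nan],
])
missing = np.isnan(features).any(axis=1)

filled_mean = mean_impute(features, missing)
filled_graph = neighbor_impute(features, missing, interactions)
```

In this toy setup, item 2 is co-interacted with items 0 and 1, so the graph-based fill weights item 1 more heavily because the two share more users, whereas the mean baseline treats all observed items equally. Item 3 has no interactions at all, so the sketch falls back to the global mean. In both cases every item stays in the catalogue, which is the point of imputing rather than filtering.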