Abstract: Graph neural networks (GNNs) have shown great potential for personalized recommendation. The core idea is to reorganize interaction data as a user-item bipartite graph and exploit high-order connectivity among user and item nodes to enrich their representations. Despite this success, most existing works build the interaction graph from ID information alone, forgoing item contents from multiple modalities (e.g., visual, acoustic, and textual features of micro-video items). Distinguishing personal interests on different modalities at a granular level was not explored until the recently proposed MMGCN (Wei et al., 2019). However, MMGCN simply employs GNNs on parallel interaction graphs and treats information propagated from all neighbors equally, failing to capture user preference adaptively. Hence, the resulting representations may preserve redundant or even noisy information, which hurts robustness and leads to suboptimal performance. In this work, we investigate how to apply GNNs to multimodal interaction graphs so as to adaptively capture user preference on different modalities and to offer in-depth analysis of why an item suits a user. To this end, we propose a new Multimodal Graph Attention Network (MGAT), which disentangles personal interests at the granularity of modality. In particular, built upon multimodal interaction graphs, MGAT conducts information propagation within individual graphs while leveraging a gated attention mechanism to identify the varying importance of different modalities to user preference. As such, it can capture more complex interaction patterns hidden in user behaviors and provide more accurate recommendations. Empirical results on two micro-video recommendation datasets, Tiktok and MovieLens, show that MGAT achieves substantial improvements over state-of-the-art baselines such as NGCF (Wang, He, et al., 2019) and MMGCN (Wei et al., 2019). Further analysis on a case study illustrates how MGAT generates attentive information flow over multimodal interaction graphs.
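
The abstract describes the mechanism only at a high level, so the following is a minimal sketch rather than MGAT's actual formulation. It assumes dot-product attention over a user's neighboring items within each modality graph, and a scalar sigmoid gate per modality that scales that modality's message before the messages are summed. The function names (`attend`, `gated_multimodal_update`), the gate definition, and the toy vectors are all hypothetical and chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attend(user_vec, item_vecs):
    """Attention-weighted sum of neighbor item vectors in ONE modality graph.

    Assumption: attention scores are plain dot products between the user
    vector and each neighboring item vector.
    """
    weights = softmax([dot(user_vec, v) for v in item_vecs])
    message = [
        sum(w * v[d] for w, v in zip(weights, item_vecs))
        for d in range(len(user_vec))
    ]
    return message, weights

def gated_multimodal_update(user_vecs, neighbor_feats):
    """Fuse per-modality messages with a hypothetical scalar gate per modality.

    user_vecs:      modality name -> user vector in that modality
    neighbor_feats: modality name -> list of neighboring item vectors
    """
    fused, gates = None, {}
    for modality, user_vec in user_vecs.items():
        message, _ = attend(user_vec, neighbor_feats[modality])
        # Assumption: the gate is a sigmoid of user-message agreement.
        gate = sigmoid(dot(user_vec, message))
        gates[modality] = gate
        contribution = [gate * x for x in message]
        fused = contribution if fused is None else [
            a + b for a, b in zip(fused, contribution)
        ]
    return fused, gates

# Toy data: the user's visual vector agrees with its neighbors,
# while the textual vector disagrees with them.
user_vecs = {"visual": [1.0, 0.0], "textual": [0.0, 1.0]}
neighbor_feats = {
    "visual": [[2.0, 0.0], [0.0, 2.0]],
    "textual": [[1.0, -1.0], [1.0, -1.0]],
}
fused, gates = gated_multimodal_update(user_vecs, neighbor_feats)
```

In this toy setup the visual gate comes out larger than the textual gate, so the fused representation is dominated by the visual message. That is the kind of per-modality weighting the abstract attributes to the gated attention mechanism, although the real model presumably learns its attention and gate parameters rather than using raw dot products.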

SGAT: Scene Graph Attention Network for Video Recommendation

MGAT: Multimodal Graph Attention Network for Recommendation

SceneRec: Scene-Based Graph Neural Networks for Recommender Systems

Gated Hypergraph Neural Network for Scene-Aware Recommendation

Contextualized Graph Attention Network for Recommendation with Item Knowledge Graph

Fast Contextual Scene Graph Generation with Unbiased Context Augmentation

A Novel Neighborhood-Augmented Graph Attention Network for Sequential Recommendation

Graph-enhanced and collaborative attention networks for session-based recommendation

Attentive Sequential Model Based on Graph Neural Network for Next POI Recommendation

SGNNRec: A Scalable Double-Layer Attention-Based Graph Neural Network Recommendation Model

Set-Sequence-Graph: A Multi-View Approach Towards Exploiting Reviews for Recommendation

Sequence Recommendation Based on Interactive Graph Attention Network

Composition-Enhanced Graph Collaborative Filtering for Multi-behavior Recommendation

Graph-Augmented Co-Attention Model for Socio-Sequential Recommendation

Graph Contextualized Self-Attention Network for Session-based Recommendation

Graph convolutional network and self-attentive for sequential recommendation

KGAT: Knowledge Graph Attention Network for Recommendation

Attention-Based Recommendation On Graphs

SSGCL: Simple Social Recommendation with Graph Contrastive Learning

NGAT4Rec: Neighbor-Aware Graph Attention Network For Recommendation

Sequence-Aware Graph Neural Network for Session-based Recommendation