Abstract: With the development of the Internet, users can freely publish posts on various social media platforms, which makes it convenient to keep abreast of world events. However, posts often carry rumors, and monitoring them manually requires considerable manpower. Owing to the success of modern machine learning techniques, especially deep learning models, we treat rumor detection as an automatic classification problem. Early attempts focused on building classifiers that rely on either image or text information, i.e., a single modality in posts. Later, several multimodal detection approaches employed an early or late fusion operator to aggregate information from multiple sources. Nevertheless, they only exploit multimodal embeddings for fusion and ignore another important detection factor, i.e., the inconsistency between modalities. To solve this problem, we develop a novel deep visual-linguistic fusion network (DVLFN) considering cross-modal inconsistency, which detects rumors by jointly considering modal aggregation and contrast information. Specifically, the DVLFN first uses deep visual and textual encoders, i.e., Faster R-CNN and bidirectional encoder representations from transformers (BERT), to extract global and regional embeddings for the image and text modalities. Then, it predicts a post's authenticity from two aspects: (1) intermodal inconsistency, which employs the Wasserstein distance to efficiently measure the similarity between regional embeddings of different modalities, and (2) modal aggregation, which employs early fusion, chosen experimentally, to aggregate the two modal embeddings for prediction. Consequently, the DVLFN composes the final prediction from both the modal fusion and the inconsistency measure. Experiments are conducted on three real-world multimedia rumor detection datasets collected from Reddit, GoodNews, and Weibo. The results validate the superior performance of the proposed DVLFN.
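The two signals named in the abstract, an inconsistency measure over regional embeddings and early fusion of global embeddings, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the toy vectors stand in for Faster R-CNN and BERT outputs, the cost is squared Euclidean, and the Wasserstein distance is approximated with entropy-regularized Sinkhorn iterations, a technique the abstract does not mention.

```python
import math


def cost_matrix(xs, ys):
    """Squared Euclidean cost between every pair of region vectors (assumed cost)."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]


def sinkhorn_distance(xs, ys, eps=0.5, iters=200):
    """Approximate the Wasserstein (optimal transport) cost between two sets
    of regional embeddings, with uniform weights on each set.

    Assumption: entropy-regularized Sinkhorn iterations are used as a
    stand-in; the abstract does not say how the distance is computed.
    Note: very large costs relative to eps can underflow exp() to zero.
    """
    n, m = len(xs), len(ys)
    C = cost_matrix(xs, ys)
    K = [[math.exp(-c / eps) for c in row] for row in C]  # Gibbs kernel
    a = [1.0 / n] * n  # uniform mass on image regions
    b = [1.0 / m] * m  # uniform mass on text regions (tokens)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan is P[i][j] = u[i] * K[i][j] * v[j]; return its total cost.
    return sum(u[i] * K[i][j] * v[j] * C[i][j] for i in range(n) for j in range(m))


def early_fusion(image_global, text_global):
    """Early fusion as plain concatenation of the two global embeddings."""
    return list(image_global) + list(text_global)


# Toy stand-ins for encoder outputs (hypothetical values).
image_regions = [[0.0, 0.0], [1.0, 1.0]]
text_regions_aligned = [[0.0, 0.0], [1.0, 1.0]]
text_regions_shifted = [[3.0, 3.0], [4.0, 4.0]]

inconsistency_low = sinkhorn_distance(image_regions, text_regions_aligned)
inconsistency_high = sinkhorn_distance(image_regions, text_regions_shifted)
fused = early_fusion([0.2, 0.4, 0.6], [0.1, 0.3])
```

In this sketch a larger distance indicates that the image regions and text tokens are less consistent. How the DVLFN combines that measure with the fused representation to form the final prediction is not specified in the abstract, so the sketch stops at computing the two quantities.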

Graph Interactive Network with Adaptive Gradient for Multi-Modal Rumor Detection

Interpretable Graph Neural Network for Social Media Rumor Detection

Graph-aware multi-feature interacting network for explainable rumor detection on social network

GMRD

Unifying Multimodal Source and Propagation Graph for Rumour Detection on Social Media with Missing Features

Multimodal Fusion Network with Contrary Latent Topic Memory for Rumor Detection

Rumor detection on social media using hierarchically aggregated feature via graph neural networks

Rumor Detection Based on Knowledge Enhancement and Graph Attention Network

Unifying Multimodal Interactions for Rumor Diffusion Prediction with Global Hypergraph Modeling

A Rumor Detection Method Based on Multimodal Feature Fusion by a Joining Aggregation Structure

Joint learning of structural and textual information on propagation network by graph attention networks for rumor detection

DDGCN: Dual Dynamic Graph Convolutional Networks for Rumor Detection on Social Media

VGA: Vision and Graph Fused Attention Network for Rumor Detection

Heterogeneous Graph Attention Networks with Bi-directional Information Propagation for Rumor Detection

Rumor Detection with a novel graph neural network approach

Focusing on Relevant Responses for Multi-modal Rumor Detection

Deep visual-linguistic fusion network considering cross-modal inconsistency for rumor detection

Inconsistent Matters: A Knowledge-guided Dual-consistency Network for Multi-modal Rumor Detection

Region-enhanced Deep Graph Convolutional Networks for Rumor Detection

A Unified Contrastive Transfer Framework with Propagation Structure for Boosting Low-Resource Rumor Detection

Rumor Detection on Social Media with Bi-Directional Graph Convolutional Networks