Abstract: With the development of the mobile Internet, more and more people create and publish multi-modal posts on social media platforms, and fake news detection has become an increasingly challenging task. Although many current works focus on building models that extract abstract features from the content of each post, they neglect intrinsic semantic structure such as latent topics. These models only learn content patterns that are coupled with the specific latent topics present in the training set in order to distinguish real from fake posts, so their generalization and discriminative ability decline, especially when posts are associated with rare or new topics. Moreover, most existing works that use deep models to extract and integrate the textual and visual representations of a post neither effectively model nor sufficiently utilize the complementary but noisy multi-modal information, which contains semantic concepts and entities, to complement and enhance each modality. In this paper, to address these problems, we propose a novel end-to-end Multi-modal Topic Memory Network (MTMN), which obtains and combines post representations shared across latent topics with global features of latent topics while modeling intra-modality and inter-modality information in a unified framework. (1) To handle real scenarios in which newly arriving posts have a different topic distribution from the training data, our method incorporates a topic memory module that explicitly characterizes the final representation as a post feature shared across topics together with global features of latent topics. These two kinds of features are jointly learned and then combined to generate a robust representation. (2) To effectively integrate the multi-modal information in posts, we propose a novel blended attention module for multi-modal fusion, which simultaneously exploits the intra-modality relations within each modality and the inter-modality relations between text words and image regions so that the modalities complement and enhance each other, yielding a high-quality representation. Extensive experiments on two public real-world datasets demonstrate the superior performance of MTMN compared with other state-of-the-art algorithms.
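The abstract gives no equations, so the following is only a rough illustrative sketch of the two ideas it names, not the authors' implementation. It assumes that blended attention can be approximated by plain scaled dot-product attention over the concatenation of word and region features (so every token attends both within and across modalities), and that a topic memory read is a softmax-weighted sum over a small matrix of topic slots. All function names, shapes, and the mean-pooling step are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def blended_attention(text, image):
    # Hypothetical stand-in for the blended attention module.
    # text: (n_words, d), image: (n_regions, d).
    # Attending over the concatenated tokens covers both the
    # intra-modality and the inter-modality relations in one pass.
    tokens = np.concatenate([text, image], axis=0)      # (n, d)
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    weights = softmax(scores, axis=-1)                  # rows sum to 1
    return weights @ tokens, weights

def topic_memory_read(post, memory):
    # Hypothetical stand-in for the topic memory module.
    # post: (d,), memory: (n_topics, d) holding global topic features.
    weights = softmax(memory @ post)                    # (n_topics,)
    topic_feature = weights @ memory                    # (d,)
    # Combine the topic-shared post feature with the global topic feature.
    return np.concatenate([post, topic_feature]), weights

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 8))     # 5 word vectors (random placeholders)
image = rng.normal(size=(3, 8))    # 3 image-region vectors
memory = rng.normal(size=(4, 8))   # 4 latent-topic slots

fused, attn = blended_attention(text, image)
post = fused.mean(axis=0)          # assumed pooling into one post vector
representation, topic_weights = topic_memory_read(post, memory)
```

In a trained model the memory slots and the projection layers would be learned jointly with the classifier; here they are random placeholders so that the data flow can be inspected in isolation.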

Memory transformation networks for weakly supervised visual classification

Fast Real-Time Video Object Segmentation with a Tangled Memory Network

Enhancing video anomaly detection with learnable memory network: A new approach to memory-based auto-encoders

Compound Memory Networks for Few-Shot Video Classification

A novel spatio-temporal memory network for video anomaly detection

Multimodal Memory Modelling for Video Captioning

Attention-Driven Memory Network for Online Visual Tracking

Towards flexible perception with visual memory

Memory-Based Neighbourhood Embedding for Visual Recognition

Memory Matching Networks for One-Shot Image Recognition

Memory-Guided Semantic Learning Network for Temporal Sentence Grounding

Memory Fusion Network for Multi-view Sequential Learning

Memory-Augmented Relation Network for Few-Shot Learning

ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization

Few-shot activity recognition with cross-modal memory network

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

Memory-Augmented Spatial-Temporal Consistency Network for Video Anomaly Detection

A Visual Memory Information Interaction Mechanism Inspired Graph Neural Network for Highly Imbalanced Image Classification

Fake News Detection via Multi-Modal Topic Memory Network

Label Independent Memory for Semi-Supervised Few-shot Video Classification