Abstract: Advances in multimedia data acquisition and storage technology have led to the growth of very large multimedia databases. Analyzing this huge amount of multimedia data to discover useful knowledge is a challenging problem, and this challenge has opened the opportunity for research in Multimedia Data Mining (MDM). Multimedia data mining can be defined as the process of finding interesting patterns in media data such as audio, video, image and text that are not ordinarily accessible through basic queries and their associated results. The motivation for MDM is to use the discovered patterns to improve decision making. MDM has therefore attracted significant research effort in developing methods and tools to organize, manage, search and perform domain-specific tasks on data from domains such as surveillance, meetings, broadcast news, sports, archives, movies and medical data, as well as personal and online media collections. This paper presents a survey of the problems and solutions in Multimedia Data Mining, approached from the following angles: feature extraction, transformation and representation techniques; data mining techniques; and current multimedia data mining systems in various application domains. We discuss the main aspects of feature extraction, transformation and representation techniques: level of feature extraction, feature fusion, feature synchronization, feature correlation discovery and accurate representation of multimedia data. We also compare MDM techniques with state-of-the-art video processing, audio processing and image processing techniques. Similarly, we compare MDM techniques with state-of-the-art data mining techniques involving clustering, classification, sequence pattern mining, association rule mining and visualization. We review current multimedia data mining systems in detail, grouping them according to problem formulations and approaches. The review includes supervised and unsupervised discovery of events and actions from one or more continuous sequences. We also analyze in detail what has been achieved and which gaps remain where future research efforts could be focused. We conclude this survey with a look at open research directions.

Multimodal Data Mining in a Multimedia Database Based on Structured Max Margin Learning

Discrete Cross-Modal Hashing for Efficient Multimedia Retrieval

A Highly Scalable and Adaptable Co-Learning Framework on Multimodal Data Mining in a Multimedia Database

Multi-modal Deep Analysis for Multimedia

Latent Structure Mining With Contrastive Modality Fusion for Multimedia Recommendation

Multimodal association mining for personalized image browsing

One for more: Structured Multi-Modal Hashing for multiple multimedia retrieval tasks

Proceedings of the Twelfth International Workshop on Multimedia Data Mining

Multimedia data mining: state of the art and challenges

Enhancing Multimodal Information Retrieval Through Integrating Data Mining and Deep Learning Techniques

Enhanced Discrete Multi-Modal Hashing: More Constraints Yet Less Time to Learn

Understanding Multimedia Document Semantics for Cross-Media Retrieval

Heterogeneous multimedia data semantics mining using content and location context

Extracting Multimedia Semantics Based On Independent Modality Discovering And Fusion

An Efficient Approach for Geo-Multimedia Cross-Modal Retrieval

Multimodal Composition Example Mining for Composed Query Image Retrieval

Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization

Deep Mamba Multi-modal Learning

Improving Web-Based Learning: Automatic Annotation of Multimedia Semantics and Cross-Media Indexing

Multimodal Neural Databases