Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy

Weijian Mai, Jian Zhang, Pengfei Fang, Zhijun Zhang

2024-01-03

Abstract: In the era of Artificial Intelligence Generated Content (AIGC), conditional multimodal synthesis technologies (e.g., text-to-image, text-to-video, and text-to-audio) are gradually reshaping the natural content in the real world. The key to multimodal synthesis technology is to establish the mapping relationship between different modalities. Brain signals, serving as potential reflections of how the brain interprets external information, exhibit a distinctive One-to-Many correspondence with various external modalities. This correspondence makes brain signals emerge as a promising guiding condition for multimodal content synthesis. Brain-conditional multimodal synthesis refers to decoding brain signals back to perceptual experience, which is crucial for developing practical brain-computer interface systems and unraveling complex mechanisms underlying how the brain perceives and comprehends external stimuli. This survey comprehensively examines the emerging field of AIGC-based Brain-conditional Multimodal Synthesis, termed AIGC-Brain, to delineate the current landscape and future directions. To begin, related brain neuroimaging datasets, functional brain regions, and mainstream generative models are introduced as the foundation of AIGC-Brain decoding and analysis. Next, we provide a comprehensive taxonomy for AIGC-Brain decoding models and present task-specific representative work and detailed implementation strategies to facilitate comparison and in-depth analysis. Quality assessments are then introduced for both qualitative and quantitative evaluation. Finally, this survey explores insights gained, providing current challenges and outlining prospects of AIGC-Brain. Being the inaugural survey in this domain, this paper paves the way for the progress of AIGC-Brain research, offering a foundational overview to guide future work.

Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses the problem of decoding brain signals back into perceptual experience by using them as the guiding condition for multimodal synthesis, an approach the authors term AIGC-Brain. Specifically, it focuses on how to use brain signals as conditions for generating visual, auditory, and semantic-text content. This technology matters for developing practical brain-computer interface systems and for revealing how the brain perceives and understands the external world. The main contributions of the paper are:

1. **Basic Research**: A detailed summary of relevant brain neuroimaging datasets, functional brain regions, and mainstream generative models, laying the foundation for AIGC-Brain decoding and analysis.
2. **Method Classification**: A systematic classification of AIGC-Brain decoding models that covers the workflow, intrinsic mapping relationships, and advantages and disadvantages of each method, clarifying the current methodological landscape.
3. **Task-Specific Implementation**: Detailed implementation strategies for various AIGC-Brain tasks, with representative works and specific implementation pipelines, to support comparison and in-depth analysis of technical trends, task-specific features, and preferences.
4. **Quality Evaluation**: An overview of task-specific quality evaluation methods in AIGC-Brain, supporting both qualitative and quantitative assessment of synthesis results.
5. **Comprehensive Insights**: A summary of the current challenges facing AIGC-Brain and a discussion of future research directions.

Through these contributions, the paper provides guidance and a foundation for further research and development in the field of AIGC-Brain.

AI-Generated Content (AIGC) for Various Data Modalities: A Survey

A Survey of Cross-Modality Brain Image Synthesis

Multimodal Image Synthesis and Editing: The Generative AI Era

A Brain-Inspired In-Memory Computing System for Neuronal Communication Via Memristive Circuits

Artificial Intelligence Based Multimodal Language Decoding from Brain Activity: A Review

A world survey of artificial brain projects, Part I: Large-scale brain simulations

Multimodal Contrastive Learning for Brain-Machine Fusion: from Brain-in-the-loop Modeling to Brain-out-of-the-loop Application

Brain-Inspired Computing: A Systematic Survey and Future Trends

Multi-modal Cognitive Computing

A Survey on Audio Synthesis and Audio-Visual Multimodal Processing

A survey on multimodal-guided visual content synthesis

A World Survey of Artificial Brain Projects, Part II: Biologically Inspired Cognitive Architectures

Multi-Constraint Transferable Generative Adversarial Networks for Cross-Modal Brain Image Synthesis

Generative AI for brain image computing and brain network computing: a review

Deep Neural Networks and Brain Alignment: Brain Encoding and Decoding (Survey)

Cross-Modality Neuroimage Synthesis: A Survey

Guest Editorial Multimodal Modeling and Analysis Informed by Brain Imaging—Part II

Multimodal foundation models are better simulators of the human brain

Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex

Brain-inspired Multimodal Learning Based on Neural Networks