Abstract: Employing massive Mobile AI-Generated Content (AIGC) Service Providers (MASPs) with powerful models, high-quality AIGC services can become accessible for resource-constrained end users. However, this advancement, referred to as mobile AIGC, also introduces a significant challenge: users should download large AIGC outputs from the MASPs, leading to substantial bandwidth consumption and potential transmission failures. In this paper, we apply cross-modal Generative Semantic Communications (G-SemCom) in mobile AIGC to overcome wireless bandwidth constraints. Specifically, we utilize a series of cross-modal attention maps to indicate the correlation between user prompts and each part of AIGC outputs. In this way, the MASP can analyze the prompt context and filter the most semantically important content efficiently. Only semantic information is transmitted, with which users can recover the entire AIGC output with high quality while saving mobile bandwidth. Since the transmitted information not only preserves the semantics but also prompts the recovery, we formulate a joint semantic encoding and prompt engineering problem to optimize the bandwidth allocation among users. Particularly, we present a human-perceptual metric named Joint Perpetual Similarity and Quality (JPSQ), which is fused by two learning-based measurements regarding semantic similarity and aesthetic quality, respectively. Furthermore, we develop the Attention-aware Deep Diffusion (ADD) algorithm, which learns attention maps and leverages the diffusion process to enhance the environment exploration ability. Extensive experiments demonstrate that our proposal can reduce the bandwidth consumption of mobile users by 49.4% on average, with almost no perceptual difference in AIGC output quality. Moreover, the ADD algorithm shows superior performance over baseline DRL methods, with 1.74x higher overall reward.
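
To make the idea of a fused perceptual score concrete, the sketch below combines a semantic-similarity score and an aesthetic-quality score into a single number. The convex weighting, the [0, 1] normalization, and the names `fused_score` and `weight` are assumptions made for illustration only; the abstract does not state how JPSQ actually fuses its two measurements.

```python
def fused_score(similarity: float, quality: float, weight: float = 0.5) -> float:
    """Combine two scores in [0, 1] into one value (hypothetical convex fusion)."""
    for name, value in (("similarity", similarity),
                        ("quality", quality),
                        ("weight", weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Assumed fusion rule: a weighted average, not the paper's definition.
    return weight * similarity + (1.0 - weight) * quality


if __name__ == "__main__":
    # Equal weighting of a similarity score and a quality score.
    print(fused_score(0.8, 0.6))
```

A weighted average is only one plausible choice; it keeps the result in [0, 1] and lets a single parameter trade semantic fidelity against visual quality.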

What problem does this paper attempt to address?

This paper addresses how to reduce bandwidth consumption while preserving high-quality output in mobile AI-Generated Content (mobile AIGC). Existing mobile AIGC methods use powerful AIGC models to provide high-quality services to resource-constrained end users, but users must download large AIGC outputs from Mobile AIGC Service Providers (MASPs), which consumes significant bandwidth and risks transmission failures. These problems degrade the user experience and raise users' traffic costs.

To solve them, the paper proposes the cross-modal Generative Semantic Communications (G-SemCom) framework, which overcomes wireless bandwidth limitations in three ways:

1. **Cross-Modal Attention Maps**: While generating the AIGC output, the MASP uses a series of cross-modal attention maps to represent the association between the user prompt and each part of the output. The MASP can thus analyze the prompt context and efficiently filter out the most semantically important content.
2. **Semantic Information Transmission**: Only semantic information is transmitted instead of the complete AIGC output. The user side recovers the entire output through a lightweight decoder, saving mobile bandwidth while maintaining high-quality output.
3. **Joint Semantic Encoding and Prompt Engineering**: To optimize bandwidth allocation, the paper formulates a joint semantic encoding and prompt engineering problem that aims to maximize semantic similarity and output quality simultaneously while saving wireless bandwidth. For this purpose, the authors define a new perceptual metric, Joint Perpetual Similarity and Quality (JPSQ), and solve the optimization with the Attention-aware Deep Diffusion (ADD) algorithm.

Experiments show that this method reduces bandwidth consumption by 49.4% on average with almost no loss in AIGC output quality. In addition, the ADD algorithm significantly outperforms the baseline Deep Reinforcement Learning (DRL) methods in convergence speed and bandwidth allocation efficiency, achieving 1.74x higher overall reward.
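
The attention-based filtering step can be illustrated with a minimal sketch. It assumes the output is split into a grid of patches, that each patch already has one aggregated attention score, and that the bandwidth budget is expressed as the fraction of patches to keep. The names `select_semantic_patches`, `budget_ratio`, and `bandwidth_saving` are hypothetical, and top-k selection stands in for whatever encoding policy the paper's ADD algorithm actually learns.

```python
from typing import List, Tuple


def select_semantic_patches(attention: List[List[float]],
                            budget_ratio: float) -> List[Tuple[int, int]]:
    """Return (row, col) indices of the highest-attention patches within budget."""
    if not 0.0 < budget_ratio <= 1.0:
        raise ValueError("budget_ratio must be in (0, 1]")
    scored = [(score, r, c)
              for r, row in enumerate(attention)
              for c, score in enumerate(row)]
    # Always keep at least one patch so the receiver has something to recover from.
    keep = max(1, int(len(scored) * budget_ratio))
    # Highest score first; ties are broken by position for determinism.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return sorted((r, c) for _, r, c in scored[:keep])


def bandwidth_saving(total_patches: int, kept_patches: int) -> float:
    """Fraction of patches that are not transmitted."""
    return 1.0 - kept_patches / total_patches


if __name__ == "__main__":
    # Toy 2x4 attention map: larger values mean stronger prompt relevance.
    attention_map = [
        [0.90, 0.10, 0.05, 0.70],
        [0.20, 0.80, 0.30, 0.02],
    ]
    kept = select_semantic_patches(attention_map, budget_ratio=0.5)
    print(kept)                            # [(0, 0), (0, 3), (1, 1), (1, 2)]
    print(bandwidth_saving(8, len(kept)))  # 0.5
```

In this toy setting the sender transmits only the selected patches, and the receiver would regenerate the rest, guided by the prompt. Real savings depend on how the kept content is encoded, which this sketch does not model.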

Cross-Modal Generative Semantic Communications for Mobile AIGC: Joint Semantic Encoding and Prompt Engineering

A Wireless AI-Generated Content (AIGC) Provisioning Framework Empowered by Semantic Communication

Agent-driven Generative Semantic Communication with Cross-Modality and Prediction

Generative AI-aided Joint Training-free Secure Semantic Communications via Multi-modal Prompts

Harnessing the Power of AI-Generated Content for Semantic Communication

Optimizing AIGC Services by Prompt Engineering and Edge Computing: A Generative Diffusion Model-Based Contract Theory Approach

Joint Model Assignment and Resource Allocation for Cost-Effective Mobile Generative Services

User-Centric Interactive AI for Distributed Diffusion Model-based AI-Generated Content

Semantic Communications for Artificial Intelligence Generated Content (AIGC) Toward Effective Content Creation

Generative AI-aided Optimization for AI-Generated Content (AIGC) Services in Edge Networks

Exploring Collaborative Distributed Diffusion-Based AI-Generated Content (AIGC) in Wireless Networks

Offloading and Quality Control for AI Generated Content Services in 6G Mobile Edge Computing Networks

Optimizing Mobile-Edge AI-Generated Everything (AIGX) Services by Prompt Engineering: Fundamental, Framework, and Case Study

Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints

Generative Semantic Communication via Textual Prompts: Latency Performance Tradeoffs

Large Generative Model Assisted 3D Semantic Communication

Cross-modal Semantic Communications in 6G

Unleashing the Power of Edge-Cloud Generative AI in Mobile Networks: A Survey of AIGC Services

Scalable AI Generative Content for Vehicular Network Semantic Communication