Semantic Communication based on Large Language Model for Underwater Image Transmission

Weilong Chen, Wenxuan Xu, Haoran Chen, Xinran Zhang, Zhijin Qin, Yanru Zhang, Zhu Han
2024-08-26
Abstract: Underwater communication is essential for environmental monitoring, marine biology research, and underwater exploration. Traditional underwater communication faces limitations like low bandwidth, high latency, and susceptibility to noise, while semantic communication (SC) offers a promising solution by focusing on the exchange of semantics rather than symbols or bits. However, SC encounters challenges in underwater environments, including semantic information mismatch and difficulties in accurately identifying and transmitting critical information that aligns with the diverse requirements of underwater applications. To address these challenges, we propose a novel Semantic Communication (SC) framework based on Large Language Models (LLMs). Our framework leverages visual LLMs to perform semantic compression and prioritization of underwater image data according to the query from users. By identifying and encoding key semantic elements within the images, the system selectively transmits high-priority information while applying higher compression rates to less critical regions. On the receiver side, an LLM-based recovery mechanism, along with Global Vision ControlNet and Key Region ControlNet networks, aids in reconstructing the images, thereby enhancing communication efficiency and robustness. Our framework reduces the overall data size to 0.8% of the original. Experimental results demonstrate that our method significantly outperforms existing approaches, ensuring high-quality, semantically accurate image reconstruction.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the problem of transmitting image data efficiently and reliably in underwater environments. Traditional underwater communication technologies (such as acoustic communication) suffer from limited bandwidth, high latency, and susceptibility to noise interference. These issues limit the effective transmission of multimodal data (such as images, videos, and sensor data) underwater. To overcome them, the paper proposes a semantic communication (SC) framework based on large language models (LLMs) that aims to improve the transmission efficiency and quality of underwater image data through semantic compression and prioritization. Specifically, the proposed method includes the following aspects:

1. **Semantic Compression and Prioritization**: Visual LLMs perform semantic compression and prioritization on underwater image data. Key semantic elements in the images are identified and encoded based on user queries, and high-priority information is transmitted selectively, while higher compression rates are applied to less important areas.
2. **Recovery Mechanism at the Receiving End**: At the receiving end, an LLM-based recovery mechanism, together with a Global Vision ControlNet and a Key Region ControlNet, reconstructs the images, thereby enhancing the efficiency and robustness of communication.
3. **Experimental Validation**: Experimental results show that the proposed method reduces the data size to 0.8% of the original while still maintaining high-quality image reconstruction under high noise conditions.

Through these innovations, the paper aims to address the shortcomings of traditional underwater communication technologies in terms of bandwidth, latency, and robustness, providing a new solution for efficient data transmission in underwater environments.
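The prioritization idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the paper relies on visual LLMs and ControlNet networks, whereas this sketch substitutes plain pixel subsampling for compression and takes a hand-picked bounding box in place of a query-driven key region. The names `Region`, `encode`, `payload_size`, and `compression_ratio`, as well as the subsampling factors, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Bounding box of the high-priority area (assumed to be given)."""
    top: int
    left: int
    height: int
    width: int


def downsample(pixels, factor):
    """Keep every `factor`-th pixel in both directions (crude lossy stand-in)."""
    return [row[::factor] for row in pixels[::factor]]


def crop(pixels, region):
    """Extract the rows and columns covered by `region`."""
    rows = pixels[region.top:region.top + region.height]
    return [row[region.left:region.left + region.width] for row in rows]


def encode(pixels, key_region, key_factor=1, background_factor=8):
    """Send the key region lightly compressed and the global view heavily compressed."""
    return {
        "key_region": key_region,
        "key_pixels": downsample(crop(pixels, key_region), key_factor),
        "background": downsample(pixels, background_factor),
    }


def payload_size(payload):
    """Number of pixel values that would be transmitted."""
    def count(grid):
        return sum(len(row) for row in grid)
    return count(payload["key_pixels"]) + count(payload["background"])


def compression_ratio(pixels, payload):
    """Transmitted pixel count divided by the original pixel count."""
    original = sum(len(row) for row in pixels)
    return payload_size(payload) / original


# Toy 64x64 grayscale image with a 16x16 key region near the centre.
image = [[(r * 64 + c) % 256 for c in range(64)] for r in range(64)]
key = Region(top=24, left=24, height=16, width=16)
payload = encode(image, key)
ratio = compression_ratio(image, payload)
```

In this toy setting the key region is sent at full resolution while the rest of the image is reduced to a coarse global view, so most of the transmitted data is spent on the region the user cares about. The resulting ratio depends entirely on the assumed factors and region size and is not comparable to the 0.8% figure reported in the paper, which comes from a learned semantic pipeline.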