Abstract: Multi-baseline Synthetic Aperture Radar (SAR) three-dimensional (3D) tomography is a crucial remote sensing technique that provides 3D resolution unavailable in conventional SAR imaging. However, achieving high-quality imaging typically requires multi-angle or full-aperture data, resulting in significant imaging costs. Recent advancements in sparse 3D SAR, which rely on data from limited apertures, have gained attention as a cost-effective alternative. Notably, deep learning techniques have markedly enhanced the imaging quality of sparse 3D SAR. Despite these advancements, existing methods primarily depend on high-resolution radar images for supervising the training of deep neural networks (DNNs). This exclusive dependence on single-modal data prevents the introduction of complementary information from other data sources, limiting further improvements in imaging performance. In this paper, we introduce a Cross-Modal 3D-SAR Reconstruction Network (CMAR-Net) to enhance 3D SAR imaging by integrating heterogeneous information. Leveraging cross-modal supervision from 2D optical images and error transfer guaranteed by differentiable rendering, CMAR-Net achieves efficient training and reconstructs highly sparse multi-baseline SAR data into visually structured and accurate 3D images, particularly for vehicle targets. Extensive experiments on simulated and real-world datasets demonstrate that CMAR-Net significantly outperforms state-of-the-art sparse reconstruction algorithms based on compressed sensing (CS) and deep learning (DL). Furthermore, our method eliminates the need for time-consuming full-aperture data preprocessing and relies solely on computer-rendered optical images, significantly reducing dataset construction costs. This work highlights the potential of deep learning for multi-baseline SAR 3D imaging and introduces a novel framework for radar imaging research through cross-modal learning.

What problem does this paper attempt to address?

This paper addresses the problem of achieving high-quality 3D reconstruction from sparse multi-baseline data in multi-baseline Synthetic Aperture Radar (SAR) three-dimensional (3D) imaging. Existing methods rely on high-resolution radar images to supervise the training of deep neural networks (DNNs), which limits further improvement in imaging performance and requires considerable time and resources for full-aperture data preprocessing. Because they are restricted to single-modal data, these methods also cannot easily draw on complementary information from other data sources. To solve these problems, the paper proposes the Cross-Modal 3D SAR Reconstruction Network (CMAR-Net), which significantly improves the quality of 3D SAR imaging by integrating heterogeneous information and using 2D optical images for cross-modal supervision.

The main contributions of the paper are:

1. **Introduction of cross-modal learning**: The concept of cross-modal learning is introduced into SAR 3D reconstruction for the first time. Supervision with 2D optical images overcomes the inherent resolution limitations of electromagnetic images.
2. **Proposal of CMAR-Net**: The network combines a dedicated data augmentation strategy with a projection and back-projection module, which improves robustness and generalization. It is trained only on simulated data and achieves good results on real data without fine-tuning.
3. **Simplification of dataset construction**: Only 2D optical images are required as supervision data, which eliminates the need for high-resolution full-aperture data and reduces the cost and complexity of dataset construction.
4. **Excellent experimental results**: Under low signal-to-noise ratio (SNR) and highly sparse angular measurement conditions, CMAR-Net still significantly improves the quality of 3D target reconstruction. The Peak Signal-to-Noise Ratio (PSNR) improves by an average of 75.83% and the Structural Similarity Index (SSIM) by 47.85%.

### Formula summary

The key formulas in the paper are:

- **Volume rendering formula**:

  \[ C(r)=\int_{t_{n}}^{t_{f}}T(t)\cdot\sigma(r(t))\,dt \]

  where \(\sigma(r(t))\) is the volume density at point \(r(t)\) along the camera ray, \(t_{n}\) and \(t_{f}\) are the near and far integration bounds, and \(T(t)\) is the cumulative transmittance, defined as:

  \[ T(t)=\exp\left(-\int_{t_{n}}^{t}\sigma(r(u))\,du\right) \]

- **Discrete integral estimation**:

  \[ \hat{C}(r)=\sum_{i = 1}^{N}T_{i}\left(1-\exp(-\sigma_{i}\delta_{i})\right) \]

  where

  \[ T_{i}=\exp\left(-\sum_{j = 1}^{i - 1}\sigma_{j}\delta_{j}\right) \]

  and \(\delta_{i}=t_{i + 1}-t_{i}\) is the distance between adjacent samples.

- **Huber loss function**:

  \[ L_{\text{huber}}(I_{g}^{i},I_{r}^{i},\gamma)= \begin{cases} \frac{1}{2}\left\|I_{g}^{i}-I_{r}^{i}\right\|^{2}&\text{if }\left\|I_{g}^{i}-I_{r}^{i}\right\|\leq\gamma\\ \gamma\left\|I_{g}^{i}-I_{r}^{i}\right\|-\frac{1}{2}\gamma^{2}&\text{otherwise} \end{cases} \]

  The total loss averages the Huber loss over the \(V\) views:

  \[ L=\frac{1}{V}\sum_{i = 0}^{V - 1}L_{\text{huber}}(I_{g}^{i},I_{r}^{i},\gamma) \]

With these improvements, CMAR-Net both improves the quality of 3D SAR reconstruction and simplifies dataset construction, demonstrating the potential of cross-modal learning in SAR imaging.
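The discrete rendering estimate and the view-averaged Huber loss from the formula summary can be illustrated with a short sketch. This is not the authors' implementation: the function names `render_ray`, `huber_loss`, and `total_loss` are hypothetical, images are represented as flat lists of floats, and the norm in the Huber loss is assumed to be the Euclidean norm of the whole image difference, which is a literal reading of the formula rather than a confirmed detail of the paper.

```python
import math


def render_ray(sigmas, deltas):
    """Discrete estimate C_hat(r) = sum_i T_i * (1 - exp(-sigma_i * delta_i)).

    sigmas: volume densities sigma_i at the samples along one ray.
    deltas: spacings delta_i = t_{i+1} - t_i between adjacent samples.
    """
    if len(sigmas) != len(deltas):
        raise ValueError("sigmas and deltas must have the same length")
    accumulated = 0.0  # running sum of sigma_j * delta_j for j < i
    value = 0.0
    for sigma, delta in zip(sigmas, deltas):
        transmittance = math.exp(-accumulated)   # T_i
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of sample i
        value += transmittance * alpha
        accumulated += sigma * delta
    return value


def huber_loss(generated, reference, gamma):
    """Huber loss between two images given as flat lists of floats.

    Assumption: the norm is the Euclidean norm of the image difference.
    """
    norm = math.sqrt(sum((g - r) ** 2 for g, r in zip(generated, reference)))
    if norm <= gamma:
        return 0.5 * norm ** 2
    return gamma * norm - 0.5 * gamma ** 2


def total_loss(generated_views, reference_views, gamma):
    """Average of the Huber loss over the V rendered/reference view pairs."""
    num_views = len(generated_views)
    pairs = zip(generated_views, reference_views)
    return sum(huber_loss(g, r, gamma) for g, r in pairs) / num_views


if __name__ == "__main__":
    # One ray with two samples, then a two-view loss on toy "images".
    print(render_ray([1.0, 2.0], [0.5, 0.25]))
    print(total_loss([[3.0, 4.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], 10.0))
```

Because each term is \(T_{i}-T_{i+1}\), the sum telescopes to \(1-\exp\left(-\sum_{i}\sigma_{i}\delta_{i}\right)\), so the rendered value stays in \([0, 1)\) and behaves like an accumulated opacity along the ray. In a training setting the same arithmetic would be written in an automatic-differentiation framework so that the loss gradient can flow back through the rendering step.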

CMAR-Net: Accurate Cross-Modal 3D SAR Reconstruction of Vehicle Targets with Sparse Multi-Baseline Data
