Abstract: Current multi-contrast MRI super-resolution (SR) methods often rely on convolutional neural networks (CNNs) for feature extraction and fusion. However, existing models have shortcomings that prevent them from producing more satisfactory results. First, some high-frequency details are lost during feature extraction, which blurs boundaries in the reconstructed images and may impede subsequent diagnosis and treatment. Second, the receptive field of a convolution kernel is limited, making it difficult for the networks to capture long-range (non-local) features. Third, most of these models are driven solely by training data and neglect prior knowledge about the correlations among different contrasts, which, if well leveraged, can effectively enhance performance when training data are limited. In this paper, we propose a novel model, WavTrans, that synergizes wavelet transforms with a new cross-attention transformer to tackle these challenges comprehensively. Specifically, we apply a one-level wavelet transform to obtain the detail and approximation coefficients of the reference-contrast MR images (Ref). The approximation coefficients compress the low-frequency global information, while the detail coefficients represent the high-frequency local structure and texture information. We then propose a new residual cross-attention Swin transformer that extracts and fuses features, establishing long-distance dependencies between them and maximizing the restoration of high-frequency information in the target-contrast image (Tar). In addition, a multi-residual fusion module fuses the high-frequency information in the upsampled Tar and the original Ref to ensure the restoration of detailed information. Extensive experiments demonstrate that WavTrans outperforms state-of-the-art (SOTA) methods by a considerable margin at upsampling factors of 2 and 4.
Code will be available at https://github.com/XAIMI-Lab/WavTrans.
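
To make the one-level wavelet decomposition mentioned in the abstract concrete, the following is a minimal NumPy sketch that splits a 2D image into one approximation band and three detail bands. It is not the authors' implementation: the abstract does not name a wavelet family, so the orthonormal Haar wavelet is assumed here, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def haar_dwt2(image):
    """One-level 2D Haar transform (assumed wavelet; illustrative only).

    Returns (approx, (horizontal, vertical, diagonal)), each half the
    height and width of the input.
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 2.0      # low-frequency content
    horizontal = (a + b - c - d) / 2.0  # difference between the two rows
    vertical = (a - b + c - d) / 2.0    # difference between the two columns
    diagonal = (a - b - c + d) / 2.0    # difference along the diagonals
    return approx, (horizontal, vertical, diagonal)


def haar_idwt2(approx, details):
    """Invert haar_dwt2, rebuilding the full-resolution image."""
    h, v, d = details
    out = np.empty((2 * approx.shape[0], 2 * approx.shape[1]))
    out[0::2, 0::2] = (approx + h + v + d) / 2.0
    out[0::2, 1::2] = (approx + h - v - d) / 2.0
    out[1::2, 0::2] = (approx - h + v - d) / 2.0
    out[1::2, 1::2] = (approx - h - v + d) / 2.0
    return out
```

In this sketch the approximation band is a half-resolution summary of the image, while the three detail bands are nonzero only where neighbouring pixels differ, which is why they carry edge and texture information. Because the assumed Haar transform is orthonormal, the decomposition is exactly invertible and preserves the sum of squared intensities.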

WavTrans: Synergizing Wavelet and Cross-Attention Transformer for Multi-contrast MRI Super-Resolution
