Abstract: In this article, we propose a new source separation method in which the dual-tree complex wavelet transform (DTCWT) and the short-time Fourier transform (STFT) are applied sequentially as dual transforms and sparse nonnegative matrix factorization (SNMF) is used to factorize the magnitude spectrum. STFT-based source separation is limited by the trade-off between time and frequency resolution: with a fixed analysis window, it cannot determine exactly which frequencies are present at a given time. Discrete wavelet transform (DWT)-based source separation suffers from shift variance (i.e., a small shift in the time-domain signal causes significant variation in the energy of the wavelet coefficients). To address these issues, we utilize the DTCWT, which comprises two-level trees with different sets of filters and provides additional information for analysis, approximate shift invariance, and perfect reconstruction of the time-domain signal. The time-domain signal is first transformed into a set of subband signals in which low- and high-frequency components are isolated. Next, each subband is passed through the STFT and a complex spectrogram is constructed. Then, SNMF is applied to decompose the magnitude part into a weighted linear combination of the trained basis vectors for both sources. Finally, the estimated signals are obtained by applying a subband binary ratio mask, followed by the inverse STFT (ISTFT) and the inverse DTCWT (IDTCWT). The proposed method is evaluated on speech separation tasks using the GRID audiovisual and TIMIT corpora. The experimental findings indicate that the proposed approach outperforms the existing methods.
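The factorization and masking steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that `V` holds the STFT magnitude of a single subband, that the bases `W1` and `W2` have already been trained for the two sources, that the SNMF cost is the Euclidean distance with an L1 penalty on the activations, and that the "binary ratio mask" thresholds the ratio of the two source estimates at 0.5. The function names, the `sparsity` weight, and the iteration count are hypothetical, and the DTCWT, STFT, and their inverses are left out.

```python
import numpy as np

def snmf_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-12):
    """Estimate nonnegative activations H with V ~= W @ H, keeping W fixed.

    Assumed cost: 0.5 * ||V - W H||^2 + sparsity * sum(H), minimized with
    the standard multiplicative update for H.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # The L1 penalty appears as a constant added to the denominator.
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def separate_magnitudes(V, W1, W2, sparsity=0.1, binary=True, eps=1e-12):
    """Split the mixture magnitude V into two source magnitudes with a mask."""
    W = np.hstack([W1, W2])              # concatenate the trained bases
    H = snmf_activations(V, W, sparsity)
    k = W1.shape[1]
    S1 = W1 @ H[:k]                      # initial estimate of source 1
    S2 = W2 @ H[k:]                      # initial estimate of source 2
    ratio = S1 / (S1 + S2 + eps)
    # Assumption: the binary mask is the ratio mask thresholded at 0.5.
    mask = (ratio > 0.5).astype(float) if binary else ratio
    return mask * V, (1.0 - mask) * V

# Toy example: two sources that occupy disjoint frequency bins.
W1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
W2 = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
rng = np.random.default_rng(1)
X1 = W1 @ (rng.random((2, 5)) + 0.5)     # magnitude of source 1
X2 = W2 @ (rng.random((2, 5)) + 0.5)     # magnitude of source 2
est1, est2 = separate_magnitudes(X1 + X2, W1, W2)
```

In a full pipeline, each masked magnitude would be recombined with the mixture phase of its subband before the ISTFT and the IDTCWT. Applying the mask to the mixture, rather than using `S1` and `S2` directly, ensures that the two estimates sum to the observed magnitude.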

Nearest Neighbor Search-Based Bitwise Source Separation Using Discriminant Winner-Take-All Hashing

Blind Source Separation Algorithm Based on Wavelet Denoising

Dual-Transform Source Separation Using Sparse Nonnegative Matrix Factorization

Online Noisy Single-Channel Source Separation Using Adaptive Spectrum Amplitude Estimator and Masking

Single Channel Audio Source Separation

Sound Source Separation Using Latent Variational Block-Wise Disentanglement

Audio query-based music source separation

Blind Separation and Extraction of Binary Sources

Single Channel Blind Source Separation Using the Best Characteristic Basis

Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform

Unsupervised Single-Channel Separation of Nonstationary Signals Using Gammatone Filterbank and Itakura–Saito Nonnegative Matrix Two-Dimensional Factorizations

A Blind Separation Algorithm of Speech Mixtures Base on Time-Frequency Masking

Data-Driven Source Separation Based on Simplex Analysis

Single-Channel Blind Separation Using Pseudo-Stereo Mixture and Complex 2-D Histogram

Multichannel audio signal source separation based on an Interchannel Loudness Vector Sum

A Novel Permutation Algorithm in Frequency-Domain Blind Source Separation

Score-based Source Separation with Applications to Digital Communication Signals

Quasi-Blind Source Separation Algorithm for Convolutive Mixture of Speech

A Blind Source Separation Algorithm by Using Time-Frequency Signal Analysis

Robust Source Separation with Simple One-Source-Active Detection

Unsupervised Learning For Monaural Source Separation Using Maximization-Minimization Algorithm With Time-Frequency Deconvolution