Why some audio signal short-time Fourier transform coefficients have nonuniform phase distributions

Stephen D. Voran
2024-09-14
Abstract: The short-time Fourier transform (STFT) represents a window of audio samples as a set of complex coefficients. These are advantageously viewed as magnitudes and phases and the overall distribution of phases is very often assumed to be uniform. We show that when audio signal STFT phase distributions are analyzed per-frequency or per-magnitude range, they can be far from uniform. That is, the uniform phase distribution assumption obscures significant important details. We explain the significance of the nonuniform phase distributions and how they might be exploited, derive their source, and explain why the choice of the STFT window shape influences the nonuniformity of the resulting phase distributions.
Audio and Speech Processing, Sound, Signal Processing
What problem does this paper attempt to address?
This paper asks two questions: are the phases of short-time Fourier transform (STFT) coefficients always uniformly distributed, and, where they are not, what causes the nonuniformity and how does it affect audio signal processing?

### Problem Background

The STFT coefficient phases of audio signals are conventionally assumed to be uniformly distributed. The author's analysis shows that, within particular frequency or magnitude ranges, the phase distribution can be far from uniform. This matters in practice, especially for audio signal enhancement, separation, and classification.

### Research Motivation

1. **Limitations of Existing Assumptions**: Many studies and applications assume a uniform phase distribution, but this assumption can mask important details, especially where accurate signal reconstruction or an optimized quantization strategy is required.
2. **Practical Requirements**: Understanding and exploiting nonuniform phase distributions can help improve audio signal processing algorithms such as noise suppression and speech enhancement.

### Main Contributions of the Paper

1. **Revealing Nonuniform Phase Distributions**: The paper shows nonuniform STFT coefficient phase distributions for several types of audio signals (such as speech, live music, and sound effects) and shows that the effect depends on the window shape.
2. **Explaining the Source of the Nonuniformity**: Through theoretical derivation, the author explains why pure-tone components in an audio signal are mapped nonlinearly to STFT phase, which produces a nonuniform phase distribution.
3. **Practical Significance**: The nonuniform phase distributions are perceptually significant (for example, the difference is clearly audible after phase noise is added) and mathematically significant: a quantizer optimized for the phase PDF can reduce quantization error more effectively.

### Key Formulas

- STFT Definition:
  \[ X_k=\sum_{i = 0}^{N - 1}w_i\,x(t\cdot N_s + i)\,e^{-j2\pi ki/N} \]
  where \(w_i\) is the window value, \(x(t\cdot N_s + i)\) is the time-domain sample, \(N\) is the window length, \(t\) is the frame index, and \(N_s\) is the frame advance in samples.
- Phase Calculation:
  \[ \phi_k=\angle(X_k)=\arctan\left(\frac{\Im(X_k)}{\Re(X_k)}\right) \]
  with the quadrant chosen from the signs of \(\Re(X_k)\) and \(\Im(X_k)\), so that \(\phi_k\) covers the full circle.
- Nonlinear Phase Relationship:
  \[ \phi_k=\arctan\left(\frac{c_\Im\cos(\theta+\zeta_\Im)}{c_\Re\cos(\theta+\zeta_\Re)}\right) \]
  where \(\theta\) is the phase of the pure tone and \(c_\Re, c_\Im, \zeta_\Re, \zeta_\Im\) are determined by frequency differences and window parameters. The imaginary-part terms appear in the numerator, consistent with the phase calculation above.

### Conclusion

Through experiments and theoretical analysis, the paper shows that the phase distribution of audio signal STFT coefficients is not always uniform, explains the mechanism behind the nonuniformity, and describes why it matters in practice. This offers a new perspective on, and possible improvements to, audio signal processing.
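
### Numerical Illustration

The pure-tone mechanism summarized under the key formulas can be checked numerically. The following is a minimal sketch, not the paper's code. For a tone cos(ωi + θ), each STFT coefficient is the sum of one term proportional to e^{jθ} and one proportional to e^{-jθ}. As θ sweeps the circle uniformly, the coefficient therefore traces an ellipse rather than a circle whenever both terms are non-negligible, and its phase is no longer uniform. The frame length, the bin under study, the off-grid tone frequency, and the statistic |mean(exp(j2φ))| (which is zero for uniformly distributed phases) are all assumptions chosen for illustration, as are the function names.

```python
import cmath
import math

N = 512           # frame length in samples (assumed)
K = 60            # STFT bin under study (assumed)
TONE_BIN = 40.5   # tone frequency in units of bins, deliberately off-grid (assumed)
NUM_PHASES = 180  # number of tone phases theta, evenly spaced on [0, 2*pi)


def rectangular(n):
    """Rectangular window: all ones."""
    return [1.0] * n


def hann(n):
    """Periodic Hann window: 0.5 - 0.5*cos(2*pi*i/n)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def windowed_kernel(window, k):
    """w_i * exp(-j*2*pi*k*i/N): the per-sample weights of X_k."""
    n = len(window)
    return [w * cmath.exp(-2j * math.pi * k * i / n) for i, w in enumerate(window)]


def stft_phases(window, k=K, tone_bin=TONE_BIN, num_phases=NUM_PHASES):
    """Phase of X_k for a unit tone whose phase theta sweeps the circle uniformly."""
    n = len(window)
    omega = 2 * math.pi * tone_bin / n
    kernel = windowed_kernel(window, k)
    phases = []
    for m in range(num_phases):
        theta = 2 * math.pi * m / num_phases
        frame = [math.cos(omega * i + theta) for i in range(n)]
        x_k = sum(c * x for c, x in zip(kernel, frame))
        phases.append(cmath.phase(x_k))  # four-quadrant angle in [-pi, pi]
    return phases


def nonuniformity(phases):
    """|mean(exp(j*2*phi))|: 0 for uniform phases, larger as phases cluster on one axis."""
    return abs(sum(cmath.exp(2j * p) for p in phases) / len(phases))


if __name__ == "__main__":
    for name, make_window in (("rectangular", rectangular), ("hann", hann)):
        score = nonuniformity(stft_phases(make_window(N)))
        print(f"{name:12s} nonuniformity at bin {K}: {score:.4f}")
```

In this particular configuration the rectangular window should give a clearly larger statistic than the Hann window at the same bin, because its slowly decaying sidelobes leave the two terms closer in magnitude. This illustrates that window shape affects the phase distribution of a tone. It does not reproduce the paper's measurements on real audio.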