FLAMO: An Open-Source Library for Frequency-Domain Differentiable Audio Processing

Gloria Dal Santo, Gian Marco De Bortoli, Karolina Prawda, Sebastian J. Schlecht, Vesa Välimäki
2024-09-13
Abstract: We present FLAMO, a Frequency-sampling Library for Audio-Module Optimization designed to implement and optimize differentiable linear time-invariant audio systems. The library is open-source and built on the frequency-sampling filter design method, allowing for the creation of differentiable modules that can be used stand-alone or within the computation graph of neural networks, simplifying the development of differentiable audio systems. It includes predefined filtering modules and auxiliary classes for constructing, training, and logging the optimized systems, all accessible through an intuitive interface. Practical application of these modules is demonstrated through two case studies: the optimization of an artificial reverberator and an active acoustics system for improved response smoothness.
Audio and Speech Processing
What problem does this paper attempt to address?
This paper sets out to provide an open-source library for implementing and optimizing differentiable audio processing systems in the frequency domain. Specifically, FLAMO (Frequency-sampling Library for Audio-Module Optimization) aims to simplify the creation and optimization of differentiable modules for linear time-invariant (LTI) audio systems. These modules can be used stand-alone or embedded in the computation graph of a neural network.

### Main Problems and Solutions

1. **Differentiable Audio Processing in the Frequency Domain**:
   - **Problem**: Time-domain methods applied to IIR filters suffer from vanishing or exploding gradients, high memory cost, and slow training.
   - **Solution**: Approximate each IIR filter as an FIR filter with the frequency-sampling method. Because convolution in time corresponds to multiplication in frequency, a chain of processing units becomes a product of their sampled frequency responses, which avoids the problems above.
2. **Time Aliasing Problem**:
   - **Problem**: When the inverse discrete Fourier transform (IDFT) is applied, an impulse response whose effective duration exceeds the transform length wraps around and causes time aliasing.
   - **Solution**: An anti-aliasing method is proposed. An exponentially decaying envelope \(\gamma^n\) is applied to the impulse response, and the envelope is compensated for with \(\gamma^{-n}\) once the response is back in the time domain, which reduces the influence of time aliasing.
3. **System Stability and Parameter Mapping**:
   - **Problem**: The IIR filter must remain stable, and the raw filter parameters must be mapped to a target distribution.
   - **Solution**: Dedicated parameter mapping functions ensure system stability and map the raw parameters to the desired values.
4. **Practical Application Scenarios**:
   - **Problem**: Verify the effectiveness of the FLAMO library in practical audio processing tasks.
   - **Solution**: FLAMO is demonstrated in two case studies: the optimization of an artificial reverberator, and the optimization of an active acoustics system for improved response smoothness.

### Key Formulas

- **Sampling of Frequency Response**:
  \[ z_M=\left[e^{j\pi\cdot 0/(M-1)},\,e^{j\pi\cdot 1/(M-1)},\,\ldots,\,e^{j\pi\cdot(M-2)/(M-1)},\,e^{j\pi}\right] \]
  where \(M\) is the number of frequency points, equally spaced on the unit circle between angles \(0\) and \(\pi\).
- **Frequency Response of IIR Filter**:
  \[ H(z_M)=\frac{\text{FFT}(b)}{\text{FFT}(a)}=\frac{b_0+b_1z_M^{-1}+\cdots+b_Nz_M^{-N}}{a_0+a_1z_M^{-1}+\cdots+a_Nz_M^{-N}} \]
  where \(b\) and \(a\) are the feedforward and feedback coefficients of the filter, respectively, and \(N\) is the filter order.
- **Frequency Response of Cascade System**:
  \[ H_{\text{series}}(z_M)=H_1(z_M)\cdot H_2(z_M) \]
- **Frequency Response of Recursive System**:
  \[ H_{\text{recursion}}(z_M)=\left(I-G(z_M)F(z_M)\right)^{-1}G(z_M) \]
  where \(G\) is the feedforward path, \(F\) is the feedback path, and \(I\) is the identity matrix.
- **Exponential Decay Envelope**:
  \[ \hat{h}[n]=h[n]\cdot\gamma^{n} \]
  \[ h[n]=\hat{h}[n]\cdot\gamma^{-n}=\text{IDFT}\!\left(\hat{H}(e^{j\omega})\right)\cdot\gamma^{-n} \]

With these methods and formulas, the FLAMO library can efficiently implement and optimize differentiable audio systems in the frequency domain.
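The frequency-sampling and anti-aliasing steps summarized above can be illustrated with a minimal NumPy sketch. This is not FLAMO's API: the function names (`frequency_grid`, `iir_response`, `sampled_impulse_response`), the one-pole test filter, and the values of `p`, `M`, and `gamma` are assumptions chosen only for illustration, and NumPy is used here in place of the differentiable framework a real training setup would need. The sketch samples an IIR response on the grid \(z_M\), returns to the time domain with an inverse real FFT, and compares the result with and without the decay envelope. It also checks the cascade rule \(H_1\cdot H_2\) against direct convolution of the coefficients.

```python
import numpy as np


def frequency_grid(M):
    """M points z_M on the upper unit semicircle, at angles 0 ... pi."""
    return np.exp(1j * np.pi * np.arange(M) / (M - 1))


def iir_response(b, a, z):
    """Evaluate H(z) = B(z^-1) / A(z^-1) at the points z."""
    z_inv = 1.0 / z
    # np.polyval expects the highest power first, so reverse the coefficients.
    num = np.polyval(np.asarray(b)[::-1], z_inv)
    den = np.polyval(np.asarray(a)[::-1], z_inv)
    return num / den


def sampled_impulse_response(b, a, M, gamma=1.0):
    """FIR approximation of an IIR filter via frequency sampling.

    gamma < 1 applies the decay envelope gamma**n before the IDFT and
    removes it afterwards; gamma = 1 disables the anti-aliasing step.
    """
    z = frequency_grid(M)
    # H(z / gamma) is the transfer function of h[n] * gamma**n.
    H_hat = iir_response(b, a, z / gamma)
    h_hat = np.fft.irfft(H_hat, n=2 * (M - 1))
    n = np.arange(h_hat.size, dtype=float)
    return h_hat * gamma ** (-n)


# Illustrative one-pole filter H(z) = 1 / (1 - p z^-1); its exact
# impulse response is p**n, which is still large after L samples.
p, M = 0.98, 33
b, a = [1.0], [1.0, -p]
L = 2 * (M - 1)  # length of the recovered time-domain signal
h_exact = p ** np.arange(L)

h_plain = sampled_impulse_response(b, a, M)
h_decay = sampled_impulse_response(b, a, M, gamma=0.9)
err_plain = np.max(np.abs(h_plain - h_exact))
err_decay = np.max(np.abs(h_decay - h_exact))
print(f"max error without envelope: {err_plain:.2e}")
print(f"max error with envelope:    {err_decay:.2e}")

# Series connection: the product of the sampled responses matches the
# response of the filter obtained by convolving the coefficient vectors.
z = frequency_grid(M)
b2, a2 = [0.5, 0.5], [1.0, -0.3]
H_product = iir_response(b, a, z) * iir_response(b2, a2, z)
H_cascade = iir_response(np.convolve(b, b2), np.convolve(a, a2), z)
series_ok = np.allclose(H_product, H_cascade)
print("series rule holds:", series_ok)
```

In this sketch the first wrapped-around copy of the impulse response is attenuated by roughly \(\gamma^{L}\), so a smaller \(\gamma\) suppresses aliasing more strongly. The trade-off is that the compensation \(\gamma^{-n}\) grows with \(n\) and amplifies numerical noise in the tail, so \(\gamma\) cannot be made arbitrarily small.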