Music De-limiter Networks via Sample-wise Gain Inversion

Chang-Bin Jeon, Kyogu Lee
2024-06-23
Abstract: The loudness war, an ongoing phenomenon in the music industry characterized by the increasing final loudness of music while reducing its dynamic range, has been a controversial topic for decades. Music mastering engineers have used limiters to heavily compress and make music louder, which can induce ear fatigue and hearing loss in listeners. In this paper, we introduce music de-limiter networks that estimate uncompressed music from heavily compressed signals. Inspired by the principle of a limiter, which performs sample-wise gain reduction of a given signal, we propose the framework of sample-wise gain inversion (SGI). We also present the musdb-XL-train dataset, consisting of 300k segments created by applying a commercial limiter plug-in for training real-world friendly de-limiter networks. Our proposed de-limiter network achieves excellent performance with a scale-invariant source-to-distortion ratio (SI-SDR) of 24.0 dB in reconstructing musdb-HQ from musdb-XL data, a limiter-applied version of musdb-HQ. The training data, codes, and model weights are available in our repository (https://github.com/jeonchangbin49/De-limiter).
Sound, Audio and Speech Processing
What problem does this paper attempt to address?
This paper addresses the "loudness war" in the music industry. To make music sound louder, many mastering engineers apply limiters that heavily compress the audio signal. This raises the final loudness but reduces the dynamic range of the music, and it can cause ear fatigue and hearing damage in listeners. To address this, the authors propose music de-limiter networks built on the sample-wise gain inversion (SGI) framework, which is motivated by the fact that a limiter performs sample-wise gain reduction on its input. The networks aim to estimate the original uncompressed signal from heavily compressed music, thereby improving sound quality and protecting listeners' hearing. The authors also construct a dataset, musdb-XL-train, of 300k segments processed with a commercial limiter plug-in, so that the trained networks work on real-world material. Beyond improving the listening experience, the authors hope the method gives music creators more freedom, since they can sample and create with audio that is closer to its original state.
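The idea behind sample-wise gain inversion, and the SI-SDR metric quoted in the abstract, can be illustrated with a small sketch. This is not the authors' network or training code. It assumes a toy instantaneous limiter (no attack or release smoothing, so it is effectively hard clipping) and an oracle gain estimate in place of a learned one. All function names and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def toy_limiter_gain(x, threshold=0.5):
    """Per-sample gain in (0, 1]: 1 below the threshold, threshold/|x| above it.

    Assumption: a real limiter smooths this gain over time; this toy version does not.
    """
    return [1.0 if abs(v) <= threshold else threshold / abs(v) for v in x]


def apply_gain(x, gain):
    """Limiter output modeled as sample-wise gain reduction: y[n] = g[n] * x[n]."""
    return [v * g for v, g in zip(x, gain)]


def invert_gain(y, gain_estimate, eps=1e-8):
    """Sample-wise gain inversion: x_hat[n] = y[n] / g_hat[n]."""
    return [v / max(g, eps) for v, g in zip(y, gain_estimate)]


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant source-to-distortion ratio in dB."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / (ref_energy + eps)          # optimal scaling of the reference
    target = [alpha * r for r in reference]
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


# A sine wave with a rising envelope stands in for music with dynamics.
x = [0.9 * (0.3 + 0.7 * n / 400) * math.sin(2 * math.pi * 5 * n / 400)
     for n in range(400)]

gain = toy_limiter_gain(x)            # gain the toy limiter applies
limited = apply_gain(x, gain)         # heavily compressed signal
restored = invert_gain(limited, gain) # oracle gain here; a network would estimate it

print("SI-SDR of limited signal :", si_sdr(limited, x))
print("SI-SDR of restored signal:", si_sdr(restored, x))
```

With the oracle gain the restored signal matches the input up to floating-point error, so its SI-SDR is far higher than that of the limited signal. In practice the gain is unknown and must be estimated from the limited signal alone, which is the task the de-limiter networks are trained for.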