Controlling Surprisal in Music Generation via Information Content Curve Matching

Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer
2024-08-12
Abstract: In recent years, the quality and public interest in music generation systems have grown, encouraging research into various ways to control these systems. We propose a novel method for controlling surprisal in music generation using sequence models. To achieve this goal, we define a metric called Instantaneous Information Content (IIC). The IIC serves as a proxy function for the perceived musical surprisal (as estimated from a probabilistic model) and can be calculated at any point within a music piece. This enables the comparison of surprisal across different musical content even if the musical events occur in irregular time intervals. We use beam search to generate musical material whose IIC curve closely approximates a given target IIC. We experimentally show that the IIC correlates with harmonic and rhythmic complexity and note density. The correlation decreases with the length of the musical context used for estimating the IIC. Finally, we conduct a qualitative user study to test if human listeners can identify the IIC curves that have been used as targets when generating the respective musical material. We provide code for creating IIC interpolations and IIC visualizations at https://github.com/muthissar/iic.
Sound, Artificial Intelligence, Computation and Language, Audio and Speech Processing
What problem does this paper attempt to address?
This paper addresses how to control "surprisal" in music generation. Specifically, the authors propose a method that controls surprisal by matching the information content curve of the generated music to a given target curve. The core problems and solutions are summarized below.

### Core Problems

1. **Controlling Surprisal in Music Generation**
   - As the quality of music generation systems improves and public interest grows, researchers have begun to focus on how to control these systems.
   - A particular open question is how to control surprisal, that is, how surprising musical events are to listeners.
2. **Limitations of Existing Methods**
   - Existing control systems for music generation are usually based on low-level features such as pitch, tempo, and harmony. These features map directly onto musical elements, but they do not capture the complexity and surprisal of music at a high level.
   - High-level features such as musical surprisal have not been fully studied, even though surprisal is crucial for balancing regularity and novelty in music.

### Solutions

1. **Introducing Instantaneous Information Content (IIC)**
   - A new metric, Instantaneous Information Content (IIC), is defined to quantify the information content at each time point in a music sequence.
   - The IIC is a proxy function for the musical surprisal perceived by listeners, estimated from a probabilistic model, and it can be calculated at any time point.
2. **Generating Music Using Beam Search**
   - Music fragments are generated with beam search so that the IIC curve of the generated material is as close as possible to a given target IIC curve.
   - Keeping the generated music close to the target in terms of information content is what provides control over musical surprisal.
3. **Experimental Verification**
   - Experiments examine the correlation between the IIC and harmonic complexity, rhythmic complexity, and note density.
   - A user study tests whether human listeners can identify the target IIC curve that was used to generate a piece of music.

### Formula Representation

- **Instantaneous Information Content (IIC)**

  \[ IIC(t, x) = \sum_{f(i, x) < t} \lambda(t - f(i, x), i) \cdot IC(x_i | x_{<i}) \]

  where \(\lambda(t, i)\) is a weight function, \(f(i, x)\) is a time-location function that gives the time of event \(i\) in the sequence \(x\), and \(IC(x_i | x_{<i})\) is the conditional information content of event \(x_i\) given the preceding events.

- **IIC Deviation (IC Deviation)**

  \[ \|IC^* - IIC\|_1 = \int_0^T |IC^*(t) - IIC(t, x)| \, dt \]

  or, discretized:

  \[ \|IC^* - IIC\|_1 \approx \sum_{i = 1}^m |IC^*(t_i) - IIC(t_i, x)| \, \Delta t \]

  where \(IC^*\) is the target curve.

### Conclusion

By combining the IIC with beam search, the paper proposes a method for controlling surprisal in music generation. The reported experiments indicate that the method can match a target IIC curve during generation and that human listeners can perceive the resulting changes in musical surprisal. This offers new ideas and technical means for future research on music generation.
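To make the two formulas in the Formula Representation section concrete, the following is a minimal Python sketch of the IIC and its discretized deviation. It is not the authors' implementation. The exponential-decay weight `exp_weight`, its `decay` parameter, and the representation of a piece as parallel lists of onset times and per-event information content values are assumptions made only for illustration; the summary does not specify the form of \(\lambda\).

```python
import math
from typing import Callable, Sequence


def exp_weight(dt: float, i: int, decay: float = 1.0) -> float:
    """Hypothetical weight lambda(dt, i): exponential decay in elapsed time dt.

    The event index i is accepted to mirror the formula but is unused here.
    """
    return math.exp(-decay * dt)


def iic(t: float,
        onsets: Sequence[float],
        ics: Sequence[float],
        weight: Callable[[float, int], float] = exp_weight) -> float:
    """IIC(t, x): weighted sum of the IC of all events whose onset is before t.

    onsets[i] plays the role of f(i, x); ics[i] plays the role of IC(x_i | x_<i).
    """
    return sum(weight(t - onset, i) * ic
               for i, (onset, ic) in enumerate(zip(onsets, ics))
               if onset < t)


def ic_deviation(target: Callable[[float], float],
                 onsets: Sequence[float],
                 ics: Sequence[float],
                 duration: float,
                 m: int = 100) -> float:
    """Discretized L1 deviation: sum_i |IC*(t_i) - IIC(t_i, x)| * delta_t."""
    delta_t = duration / m
    return sum(abs(target(k * delta_t) - iic(k * delta_t, onsets, ics)) * delta_t
               for k in range(1, m + 1))


# Three events at irregular onsets (seconds) with made-up IC values (bits).
onsets = [0.0, 0.5, 1.25]
ics = [2.0, 1.0, 3.0]
print(iic(1.0, onsets, ics))                            # only the first two events count
print(ic_deviation(lambda t: 1.5, onsets, ics, 2.0))    # distance to a flat target
```

Because the IIC is defined for any time `t`, two pieces with different onset grids can be compared on the same evaluation grid, which is the property the abstract highlights for events at irregular time intervals.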
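The beam search step described under Solutions can be illustrated with a toy example. This is a sketch under strong assumptions, not the procedure from the paper: `toy_model` is a hypothetical stand-in for a trained sequence model that always returns the same next-event probabilities, events are encoded as hypothetical `(symbol, duration)` pairs, the weight function is an assumed exponential decay, and candidates are ranked by time-averaged deviation so that candidates of different durations stay comparable.

```python
import math
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, float]  # (symbol, duration in seconds); hypothetical encoding


def toy_model(prefix: List[Token]) -> Dict[Token, float]:
    """Hypothetical sequence model: fixed next-event probabilities."""
    return {("C", 0.5): 0.7, ("F#", 0.5): 0.2, ("Bb", 0.25): 0.1}


def iic(t: float, onsets: List[float], ics: List[float], decay: float = 1.0) -> float:
    """IIC with an assumed exponential-decay weight function."""
    return sum(math.exp(-decay * (t - o)) * ic
               for o, ic in zip(onsets, ics) if o < t)


def mean_deviation(target: Callable[[float], float], onsets: List[float],
                   ics: List[float], end: float, dt: float = 0.05) -> float:
    """Discretized L1 deviation on [0, end], divided by the covered duration."""
    steps = int(round(end / dt))
    total = sum(abs(target(k * dt) - iic(k * dt, onsets, ics)) * dt
                for k in range(1, steps + 1))
    return total / (steps * dt)


def beam_search(model, target, n_events: int, beam_width: int = 3):
    """Keep the beam_width partial sequences whose IIC is closest to the target."""
    beams = [([], [], [], 0.0)]  # (events, onsets, ics, end time)
    for _ in range(n_events):
        candidates = []
        for events, onsets, ics, end in beams:
            for token, prob in model(events).items():
                cand = (events + [token],
                        onsets + [end],               # new event starts at current end
                        ics + [-math.log2(prob)],     # information content in bits
                        end + token[1])
                score = mean_deviation(target, cand[1], cand[2], cand[3])
                candidates.append((score, cand))
        candidates.sort(key=lambda c: c[0])
        beams = [cand for _, cand in candidates[:beam_width]]
    return beams[0]


low = beam_search(toy_model, lambda t: 0.0, n_events=4)
high = beam_search(toy_model, lambda t: 5.0, n_events=4)
print([symbol for symbol, _ in low[0]])
print([symbol for symbol, _ in high[0]])
```

With a low target the search favors the most probable (least surprising) event, while a high target pulls it toward rarer events. A real system would replace `toy_model` with a trained model and would likely sample candidate continuations rather than enumerate the whole vocabulary.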