MusicGen-Chord: Advancing Music Generation through Chord Progressions and Interactive Web-UI

Jongmin Jung, Andreas Jansson, Dasaem Jeong
2024-11-30
Abstract: MusicGen is a music generation language model (LM) that can be conditioned on textual descriptions and melodic features. We introduce MusicGen-Chord, which extends this capability by incorporating chord progression features. This model modifies one-hot encoded melody chroma vectors into multi-hot encoded chord chroma vectors, enabling the generation of music that reflects both chord progressions and textual descriptions. Furthermore, we developed MusicGen-Remixer, an application utilizing MusicGen-Chord to generate remixes of input music conditioned on textual descriptions. Both models are integrated into Replicate's web-UI using cog, facilitating broad accessibility and user-friendly controllable interaction for creating and experiencing AI-generated music.
Sound, Artificial Intelligence, Machine Learning, Audio and Speech Processing
What problem does this paper attempt to address?
The paper aims to make music generation models more controllable and expressive by adding chord-progression conditioning. It presents MusicGen-Chord, an extension of MusicGen in which generation can be conditioned on chord progressions as well as on text descriptions and melodic features.

### Main Problems and Solutions

1. **Limitations of Existing Models**:
   - Existing music generation models such as MusicGen rely mainly on text descriptions and melodic features, which limits them when generating music with complex harmonic structure.
   - One-hot encoded melody chroma vectors mark only one pitch class per time step, so they cannot represent the several simultaneous notes of a chord.
2. **Introducing Chord-Progression Features**:
   - To address this, MusicGen-Chord uses multi-hot encoded chord chroma vectors, which mark every pitch class belonging to the current chord.
   - This lets the model generate music that follows the input chord progression while still reflecting the text description.
3. **User Interaction and Application Development**:
   - The paper also introduces the MusicGen-Remixer application, which builds on MusicGen-Chord. Users upload a music clip and provide a text description; the application generates new background music and mixes it with the original audio to produce a remix.
   - Both models are deployed through Replicate's web interface using the cog package, which makes them broadly accessible and easy to use for creating and experiencing AI-generated music.
### Formula Representation

The encoding methods described in the paper can be represented by the following formulas:

- **One-Hot Encoding** (melody): exactly one pitch class \(p_t\) is active at time \(t\), so
  \[ \mathbf{v}_t = \mathbf{e}_{p_t}, \qquad \mathbf{v}_t[p] = \begin{cases} 1 & \text{if } p = p_t \\ 0 & \text{otherwise} \end{cases} \]
- **Multi-Hot Encoding** (chords):
  \[ \mathbf{v}_t = \sum_{p \in P_t} \mathbf{e}_p \]
  where \(P_t\) is the set of all pitch classes active at time \(t\), and \(\mathbf{e}_p\) is the unit vector corresponding to pitch class \(p\).

Because the multi-hot vector carries every tone of the current chord rather than a single pitch class, MusicGen-Chord can represent chord progressions directly and condition generation on them.
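
The two encodings can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names, the three-entry chord vocabulary, and the fixed number of frames per chord are all assumptions chosen for illustration. The sketch only shows how a chord symbol maps to a 12-dimensional multi-hot chroma vector, and how that vector equals the sum of the one-hot vectors of its chord tones.

```python
import numpy as np

# The 12 pitch classes, indexed 0..11 starting from C.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Hypothetical, deliberately tiny chord vocabulary:
# each quality maps to intervals in semitones above the root.
CHORD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "7": (0, 4, 7, 10),
}


def one_hot_chroma(pitch_class: int) -> np.ndarray:
    """One-hot vector e_p: a single active pitch class (melody case)."""
    v = np.zeros(12, dtype=np.float32)
    v[pitch_class % 12] = 1.0
    return v


def multi_hot_chroma(root: str, quality: str) -> np.ndarray:
    """Multi-hot vector: every pitch class in the chord is set to 1."""
    root_idx = PITCH_CLASSES.index(root)
    v = np.zeros(12, dtype=np.float32)
    for interval in CHORD_INTERVALS[quality]:
        v[(root_idx + interval) % 12] = 1.0
    return v


def progression_to_chroma(chords, frames_per_chord: int) -> np.ndarray:
    """Repeat each chord's vector over a fixed number of frames.

    Returns an array of shape (len(chords) * frames_per_chord, 12).
    A fixed frame count per chord is an assumption made for simplicity.
    """
    rows = [
        np.tile(multi_hot_chroma(root, quality), (frames_per_chord, 1))
        for root, quality in chords
    ]
    return np.concatenate(rows, axis=0)


if __name__ == "__main__":
    # G7 activates pitch classes D, F, G, B (indices 2, 5, 7, 11).
    print(np.flatnonzero(multi_hot_chroma("G", "7")))
    chroma = progression_to_chroma([("C", "maj"), ("A", "min")], frames_per_chord=4)
    print(chroma.shape)
```

A time-by-12 matrix like the one returned by `progression_to_chroma` has the same layout as a melody chromagram, which is why a chord condition can stand in for the melody condition without changing the shape of the input.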