Prevailing Research Areas for Music AI in the Era of Foundation Models

Megan Wei, Mateusz Modrzejewski, Aswin Sivaraman, Dorien Herremans
2024-09-14
Abstract: In tandem with the recent advancements in foundation model research, there has been a surge of generative music AI applications within the past few years. As the idea of AI-generated or AI-augmented music becomes more mainstream, many researchers in the music AI community may be wondering what avenues of research are left. With regard to music generative models, we outline the current areas of research with significant room for exploration. Firstly, we pose the question of foundational representation of these generative models and investigate approaches towards explainability. Next, we discuss the current state of music datasets and their limitations. We then overview different generative models, forms of evaluating these models, and their computational constraints/limitations. Subsequently, we highlight applications of these generative models towards extensions to multiple modalities and integration with artists' workflow as well as music education systems. Finally, we survey the potential copyright implications of generative music and discuss strategies for protecting the rights of musicians. While it is not meant to be exhaustive, our survey calls to attention a variety of research directions enabled by music foundation models.
Sound, Artificial Intelligence, Multimedia, Audio and Speech Processing
What problem does this paper attempt to address?
The paper explores frontier research directions in music AI in the era of foundation models. Specifically, it addresses the following aspects:

1. **Fundamental Representation and Interpretability**:
   - Explore the foundational representations used by generative models and approaches to explaining them.
   - Analyze the limitations of existing music datasets and discuss how to improve them.
2. **Generative Models and Evaluation Metrics**:
   - Overview different types of generative models and their applications in the music domain.
   - Discuss the computational constraints and limitations of existing models.
   - Survey methods for evaluating these models and examine their effectiveness and limitations.
3. **Multimodal Integration and Artist Workflow**:
   - Explore extensions of generative models to multiple modalities.
   - Discuss how to integrate these tools into artists' workflows and music education systems.
4. **Copyright Issues and Rights Protection**:
   - Investigate the potential copyright implications of generated music and discuss strategies to protect musicians' rights.

Through these discussions, the paper highlights the research directions enabled by music foundation models and encourages researchers to investigate under-explored areas. It also emphasizes current technical challenges, such as real-time music generation, singing voice generation, and multimodal control, and proposes future research directions.