A Survey of AI Music Generation Tools and Models

Yueyue Zhu, Jared Baca, Banafsheh Rekabdar, Reza Rawassizadeh
2023-08-24
Abstract: In this work, we provide a comprehensive survey of AI music generation tools, including both research projects and commercialized applications. To conduct our analysis, we classified music generation approaches into three categories: parameter-based, text-based, and visual-based classes. Our survey highlights the diverse possibilities and functional features of these tools, which cater to a wide range of users, from regular listeners to professional musicians. We observed that each tool has its own set of advantages and limitations. As a result, we have compiled a comprehensive list of these factors that should be considered during the tool selection process. Moreover, our survey offers critical insights into the underlying mechanisms and challenges of AI music generation.
Sound, Artificial Intelligence, Human-Computer Interaction, Audio and Speech Processing
### What problem does this paper attempt to solve?

This paper aims to provide a comprehensive survey of AI music generation tools and their models. Specifically:

1. **Terminology Explanation**: It first explains basic concepts so that readers unfamiliar with music creation can follow the subsequent content.
2. **Overview of Tools and Models**: It discusses current AI music generation tools and models in detail, evaluates their functionalities, and discusses their limitations.
3. **Technical Analysis**: By analyzing the latest tools and technologies, it provides a comprehensive understanding of the potential of AI-based music creation and points out areas that need improvement to enhance performance.

### Overview of Main Content

1. **Classification of Music Generation Methods**:
   - Parametric methods (e.g., Markov chains, rule-based systems, and genetic algorithms).
   - Text-based methods (e.g., generating music through descriptive text).
   - Visual-based methods (e.g., generating music through images or videos).
2. **Non-Neural Network Methods**:
   - Markov Chains: Used to analyze existing music patterns and create new works.
   - Rule-Based Systems: Generate music in specific styles based on predefined rules.
   - Genetic Algorithms: Generate new music sequences through selection, mutation, and crossover operations.
3. **Neural Network-Based Methods**:
   - Parametric Models (e.g., Magenta, Jukebox, etc.): Require specific input parameters to generate music.
   - Prompt-Based Models (e.g., Riffusion, Moûsai, etc.): Allow users to input descriptive text as prompts.
   - Visual-Based Models (e.g., CMT, V-MusProd, etc.): Use video input to generate matching background music.
4. **Specific Tool Analysis**:
   - Magenta: An open-source research project that provides various neural network models, including RNN/LSTM, MusicVAE, and GANSynth.
   - Other Commercial Tools: Provide customized parameter input interfaces.
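The Markov-chain approach listed under the non-neural methods above can be illustrated with a minimal sketch. This is not code from the surveyed paper or from any of the tools it covers. It assumes a first-order chain over MIDI pitch numbers, and the names `build_transitions`, `generate`, and `SOURCE_MELODY` are hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict


def build_transitions(melody):
    """Map each note to the list of notes that followed it in the source melody."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, seed=None):
    """Random-walk the transition table to produce a new note sequence."""
    rng = random.Random(seed)
    notes = [start]
    while len(notes) < length:
        choices = transitions.get(notes[-1])
        if not choices:  # dead end: this note never had a successor
            break
        # Repeated successors appear several times in the list, so
        # rng.choice samples them in proportion to how often they occurred.
        notes.append(rng.choice(choices))
    return notes


# Hypothetical training melody as MIDI pitch numbers (60 = middle C).
SOURCE_MELODY = [60, 62, 64, 62, 60, 64, 65, 67, 65, 64, 62, 60]

table = build_transitions(SOURCE_MELODY)
new_melody = generate(table, start=60, length=16, seed=0)
print(new_melody)
```

Every step in the generated sequence is a note-to-note transition that already occurs in the source melody, which is why the output tends to resemble the source locally without copying it as a whole. A practical system would typically also model rhythm and use a longer context than a single preceding note.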
### Summary

Through a detailed survey of AI music generation tools and models, this paper presents the functionalities, advantages, and limitations of these tools. Its ultimate goal is to give researchers, musicians, and general users a comprehensive framework for understanding these technologies so that they can use them more effectively for music creation.