Energy-Based Recurrent Model For Stochastic Modeling Of Music

Yingru Liu, Dongliang Xie, Xin Wang
DOI: https://doi.org/10.1109/ICME.2019.00049
2019-01-01
Abstract: This work aims to model the stochastic process of music-related data more accurately, which is essential for many AI applications in musicology. When music is represented as a sequence of vectorized frames, existing models generally fail to capture the correlations among the elements within each frame. We propose an energy-based model, the Chain Graphical Recurrent Neural Network (CGRNN), that exploits these correlations to model the dynamics of music more accurately. In CGRNN, a probabilistic sub-structure named the Conditional spike-and-slab Restricted Boltzmann Machine (C-ssRBM) is defined to better model the conditional covariance and joint distribution of the elements in a frame. In addition, an efficient temporal transition design allows CGRNN to track the evolution of music and extract sparse features. Using the estimated stochastic process of music, we further apply CGRNN to generate melodious music automatically. Extensive empirical evaluations on multiple unsupervised learning tasks, using symbolic MIDI and audio data, demonstrate the performance of our model.