N-gram Inverted Index Structures on Music Data for Theme Mining and Content-Based Information Retrieval

CK Wang, JZ Li, SF Shi
DOI: https://doi.org/10.1016/j.patrec.2005.09.012
IF: 4.757
2006-01-01
Pattern Recognition Letters
Abstract: Content-based music information retrieval and theme mining are two key problems in digital music information systems, where "themes" are the longest repeating patterns in a piece of music. However, most data structures built for retrieving music data cannot be used efficiently to mine the themes of music pieces, and vice versa. The suffix tree can serve both functions, but it is too large and somewhat difficult to maintain. This paper introduces an index structure that combines the ideas of inverted files and n-grams. It can be used both to retrieve music data and to mine music themes. Based on the index and several supporting concepts, a theme mining algorithm is proposed together with its theoretical analysis. In addition, two implementations of a content-based music information retrieval algorithm are presented. Experiments show the correctness and efficiency of the proposed index and algorithms.
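To make the general idea concrete, the following is a minimal sketch of an n-gram inverted index over note sequences. It is not the paper's index or its theme mining algorithm, which the abstract does not specify. It assumes each piece is a list of integer pitches (for example MIDI note numbers), that matching is exact, and that repeated patterns may overlap. The names `build_index`, `search`, and `longest_repeat` are hypothetical.

```python
from collections import defaultdict


def build_index(pieces, n=3):
    """Map each n-gram (tuple of n pitches) to its (piece_id, offset) postings."""
    index = defaultdict(list)
    for piece_id, notes in pieces.items():
        for i in range(len(notes) - n + 1):
            index[tuple(notes[i:i + n])].append((piece_id, i))
    return index


def search(index, pieces, query, n=3):
    """Return sorted (piece_id, offset) pairs where the query occurs exactly."""
    if len(query) < n:
        raise ValueError("query must contain at least n notes")
    # The first n-gram of the query selects candidates from the postings list;
    # each candidate is then verified against the full query.
    candidates = index.get(tuple(query[:n]), [])
    hits = [
        (pid, off)
        for pid, off in candidates
        if list(pieces[pid][off:off + len(query)]) == list(query)
    ]
    return sorted(hits)


def longest_repeat(index, pieces, piece_id, n=3):
    """Naive longest repeated pattern (length >= n) within one piece.

    Any n-gram with two or more postings in the piece seeds a repeat;
    each pair of occurrences is extended to the right while the notes agree.
    Assumption: overlapping occurrences count as repeats.
    """
    notes = pieces[piece_id]
    best = []
    for postings in index.values():
        offsets = [off for pid, off in postings if pid == piece_id]
        for k, a in enumerate(offsets):
            for b in offsets[k + 1:]:  # offsets are ascending, so a < b
                length = n
                while b + length < len(notes) and notes[a + length] == notes[b + length]:
                    length += 1
                if length > len(best):
                    best = list(notes[a:a + length])
    return best
```

In this sketch the same postings lists serve both tasks: `search` looks up the first n-gram of a query and verifies the remainder, while `longest_repeat` extends n-grams that occur more than once in a piece. The pairwise extension is quadratic in the number of occurrences and is only meant to illustrate the shared structure, not to reflect the efficiency claimed in the abstract.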