Abstract: Predominant instrument recognition plays a vital role in music information retrieval. The task is to identify and categorize the dominant instruments in a piece of music from their distinctive time-frequency characteristics and harmonic distribution. Existing approaches mainly learn implicit mappings (such as deep neural networks) from time-domain or frequency-domain representations of music audio to instrument labels. However, instruments playing together in polyphonic music produce locally superposed time-frequency representations, and most implicit models can be sensitive to such local changes in the data. This makes it difficult for implicit methods to accurately capture the unique harmonic features of each instrument. To address this challenge, and considering that the complete harmonic information of an instrument is distributed across a wide range of frequencies, we design a label-specific time-frequency feature learning approach that converts the task of building implicit classification mappings into a process of extracting and matching features specific to each instrument. The result is a new explicit learning model, the label-specific time-frequency energy-based neural network (LSTN). Unlike existing implicit models, LSTN not only extracts the local time-frequency features they commonly use but also incorporates time-domain factors and frequency-domain factors in its energy function to explicitly parameterize long-term correlation and long-frequency correlation features. Using the extracted time-frequency features and the two long-correlation features as instrument label-specific features, LSTN detects whether the harmonic distribution of each instrument appears in polyphonic music on both long and local time-frequency scales, which mitigates the challenges posed by locally superposed representations.
We analyze the complexity and convergence of LSTN, and experiments on benchmark datasets demonstrate the superiority of LSTN over other established instrument recognition algorithms.
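
To make the idea of a label-specific energy with local and long-range terms more concrete, the following is a minimal illustrative sketch, not the LSTN energy function itself, which the abstract does not specify. The function names (`label_energy`, `predict`), the additive form of the energy, the use of mean profiles along each axis, and the zero threshold are all assumptions made for illustration only.

```python
import numpy as np

def label_energy(spec, local_filter, time_factor, freq_factor):
    """Hypothetical per-label energy; lower means the label fits better.

    spec:         (F, T) non-negative time-frequency magnitudes
    local_filter: (f, t) small patch matched against local regions
    time_factor:  (T,) weights over all frames (long-term correlation)
    freq_factor:  (F,) weights over all bins (long-frequency correlation)
    """
    F, T = spec.shape
    f, t = local_filter.shape
    # Local term: strongest response of the patch over all valid positions.
    local = max(
        float(np.sum(spec[i:i + f, j:j + t] * local_filter))
        for i in range(F - f + 1)
        for j in range(T - t + 1)
    )
    # Long-term term: weights applied to the per-frame mean magnitude.
    long_time = float(time_factor @ spec.mean(axis=0))
    # Long-frequency term: weights applied to the per-bin mean magnitude.
    long_freq = float(freq_factor @ spec.mean(axis=1))
    return -(local + long_time + long_freq)

def predict(spec, params, threshold=0.0):
    """Return the labels whose energy falls below an assumed threshold."""
    return [name for name, p in params.items()
            if label_energy(spec, *p) < threshold]

# Toy input: three sustained partials at bins 1, 3 and 5 over six frames.
spec = np.zeros((8, 6))
spec[[1, 3, 5], :] = 1.0

# Hand-set (not learned) parameters for two hypothetical labels.
match = np.where(np.isin(np.arange(8), [1, 3, 5]), 1.0, -1.0)
params = {
    "label_a": (np.ones((2, 2)), np.ones(6) / 6, match),    # fits the partials
    "label_b": (np.zeros((2, 2)), np.zeros(6), -match),     # expects other bins
}
present = predict(spec, params)
```

In this toy setting the frequency factor for `label_a` spans all bins, so it still rewards the full harmonic pattern even if one local patch were masked by another source, which is the intuition the abstract gives for adding long-range terms alongside local features.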

Label-Specific Time-Frequency Energy-Based Neural Network for Instrument Recognition

Frame-level Instrument Recognition by Timbre and Pitch

Timbre-Based Portable Musical Instrument Recognition Using LVQ Learning Algorithm

A Multitask Learning Approach for Chinese National Instruments Recognition and Timbre Space Regression

Deep convolutional neural networks for predominant instrument recognition in polyphonic music

Automatic Instrument Recognition in Polyphonic Music Using Convolutional Neural Networks

An Instrument Indication Acquisition Algorithm Based on Lightweight Deep Convolutional Neural Network and Hybrid Attention Fine-Grained Features

Deep Neural Network for Musical Instrument Recognition using MFCCs

Visual Attention for Musical Instrument Recognition

An Attention Mechanism for Musical Instrument Recognition

Temporal Coding of Local Spectrogram Features for Robust Sound Recognition

Musical Instrument Classification via Low-Dimensional Feature Vectors

Exploring Isolated Musical Notes as Pre-training Data for Predominant Instrument Recognition in Polyphonic Music

Construction of Intelligent Recognition and Learning Education Platform of National Music Genre Under Deep Learning

Multitask learning for frame-level instrument recognition

Spike-based Encoding and Learning of Spectrum Features for Robust Sound Recognition

Application of Hidden Markov Chain and Artificial Neural Networks in Music Recognition and Classification

Robust Multipitch Estimation Of Piano Sounds Using Deep Spiking Neural Networks

Carnatic Raga Identification System using Rigorous Time-Delay Neural Network

Lute Acoustic Quality Evaluation and Note Recognition Based on the Softmax Regression BP Neural Network

Music time signature detection using ResNet18