Abstract: As one of the most important subtasks of automatic music transcription (AMT), multi-pitch estimation (MPE), which predicts the fundamental frequencies present in each frame of an audio recording, has been studied extensively over the past decade. However, how to use music perception and cognition for MPE has not yet been thoroughly investigated. Motivated by this, this paper demonstrates how to effectively detect the fundamental frequency and the harmonic structure of polyphonic music using a cognitive framework. Inspired by cognitive neuroscience, an integration of the constant-Q transform and a state-of-the-art matrix factorization method called shift-invariant probabilistic latent component analysis (SI-PLCA) is proposed to resolve the polyphonic short-time magnitude log-spectra for multiple pitch estimation and source-specific feature extraction. The cognition of rhythm, harmonic periodicity, and instrument timbre is used to guide the analysis, characterizing contiguous notes and the relationship between the fundamental frequency and its harmonic frequencies in order to detect the pitches from the outputs of SI-PLCA. In the experiments, we compare the performance of the proposed MPE system with a number of existing state-of-the-art approaches (seven weak learning methods and four deep learning methods) on three widely used datasets (MAPS, BACH10, and TRIOS) in terms of F-measure (F_1) values. The experimental results show that the proposed MPE method provides the best overall performance among the compared methods.
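
The abstract reports results as F-measure (F_1) values but does not spell out the evaluation protocol. As an illustration only, the sketch below computes a frame-level precision, recall, and F_1 in the way commonly used for MPE, assuming each frame is represented as a set of MIDI pitch numbers. The function name `frame_f_measure` and the toy data are hypothetical and are not taken from the paper.

```python
def frame_f_measure(reference, estimate):
    """Frame-level precision, recall and F1.

    Both arguments are equal-length sequences with one entry per frame;
    each entry is a collection of MIDI pitch numbers (an assumed format).
    """
    tp = fp = fn = 0
    for ref_frame, est_frame in zip(reference, estimate):
        ref_set, est_set = set(ref_frame), set(est_frame)
        tp += len(ref_set & est_set)   # pitches correctly detected
        fp += len(est_set - ref_set)   # pitches detected but not present
        fn += len(ref_set - est_set)   # pitches present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: three frames of ground truth and of estimated pitches.
reference = [{60, 64, 67}, {60, 64}, {62}]
estimate = [{60, 64}, {60, 64, 72}, {62}]
precision, recall, f1 = frame_f_measure(reference, estimate)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case there are five true positives, one false positive, and one false negative, so precision, recall, and F_1 all equal 5/6. Note-level evaluation, which also matches onsets and offsets, would need a different procedure.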

Learning optimal features for music transcription

Harmonic Frequency-Separable Transformer for Instrument-Agnostic Music Transcription

Harmonic-Aware Frequency and Time Attention for Automatic Piano Transcription

Automatic Transcription Of Real World Piano Music Using Harmonic Model

Improved Architecture for High-resolution Piano Transcription to Efficiently Capture Acoustic Characteristics of Music Signals

Automatic Piano Transcription with Hierarchical Frequency-Time Transformer

Multitrack Music Transcription with a Time-Frequency Perceiver

Stereo Feature Enhancement and Temporal Information Extraction Network for Automatic Music Transcription

A Music Cognition–Guided Framework for Multi-pitch Estimation

Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription

Polyphonic Piano Transcription Based on Graph Convolutional Network

DAFE-MSGAT: Dual-Attention Feature Extraction and Multi-Scale Graph Attention Network for Polyphonic Piano Transcription

Piano Transcription with Harmonic Attention

HPPNet: Modeling the Harmonic Structure and Pitch Invariance in Piano Transcription

Investigation on the use of Hidden-Markov Models in automatic transcription of music

Semi-Supervised Convolutive NMF for Automatic Piano Transcription

Particle Filtering for PLCA model with Application to Music Transcription

Calibration of a two-state pitch-wise HMM method for note segmentation in Automatic Music Transcription systems

Automatic Transcription Method for Polyphonic Music Based on Adaptive Comb Filter and Neural Network

Improving Automatic Piano Transcription by Refined Feature Fusion and Weighted Loss

Automatic Note Recognition and Generation of MDL and MML using FFT