Improved harmonic spectral envelope extraction for singer classification with hybridised model

Balachandra Kumaraswamy
DOI: https://doi.org/10.1504/ijbic.2024.141676
2024-10-02
International Journal of Bio-Inspired Computation
Abstract: The singing voice affects listeners through its combination of expression, lyrics, and instrumental accompaniment. Humans can readily pick out the singing voice in an audio clip thanks to their perceptual abilities and auditory physiology. Without human intervention, on the other hand, it is difficult to identify the vocal portions, non-vocal portions, emotions, and singer from the signal because of its intrinsic complexity. This paper proposes a new singer classification mechanism with four stages: 'pre-processing, vocal segmentation, feature extraction, and classification'. Initially, an 'improved convolutional neural network (CNN)' is deployed to segment the vocal part. Next, features such as 'zero crossing rate (ZCR), Mel-frequency cepstral coefficients (MFCCs), vibration estimation and improved harmonic spectral envelope' are extracted and passed to a 'bidirectional gated recurrent unit (BI-GRU) and long short-term memory (LSTM)'. The median of the LSTM and BI-GRU outputs is taken as the final result.
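Two steps named in the abstract, the ZCR feature and the median fusion of the two recurrent models' outputs, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the frame length, the hop size, and the assumption that each model emits one score per singer are all hypothetical choices made for illustration.

```python
import numpy as np

def zero_crossing_rate(signal, frame_len=1024, hop=512):
    """Per-frame fraction of adjacent sample pairs whose sign differs.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    rates = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        signs = np.signbit(frame)
        crossings = np.count_nonzero(signs[1:] != signs[:-1])
        rates.append(crossings / (frame_len - 1))
    return np.array(rates)

def fuse_scores(lstm_scores, bigru_scores):
    """Element-wise median of two per-singer score vectors.

    Assumes each model outputs one score per singer. With only two
    models, the median of each pair equals their mean.
    """
    stacked = np.stack([np.asarray(lstm_scores, dtype=float),
                        np.asarray(bigru_scores, dtype=float)])
    fused = np.median(stacked, axis=0)
    return fused, int(np.argmax(fused))
```

Calling `fuse_scores` on the two models' score vectors returns the fused scores together with the index of the highest-scoring singer; `zero_crossing_rate` returns one value per frame, which could sit alongside the MFCC and spectral envelope features in the input to the recurrent models.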
computer science, artificial intelligence, theory & methods