Deep Learning Algorithms based Voiceprint Recognition System in Noisy Environment

Hajer Y. Khdier, Wesam M. Jasim, Salah A. Aliesawi
DOI: https://doi.org/10.1088/1742-6596/1804/1/012042
2021-02-01
Journal of Physics: Conference Series
Abstract: Voiceprint Recognition (VPR) is the mechanism by which a user's identity is determined using characteristics taken from their voice. It is one of the world's most useful and common biometric recognition techniques, particularly in fields relevant to security. It can be used for authentication, monitoring, forensic identification of speakers, and a variety of related activities. This work attempts to build a system that recognizes a human speaker's identity using a Convolutional Neural Network (CNN). Two methods are used: MFCC-CNN and RW-CNN. The first is the standard method, which uses MFCC to extract features from the audio; these features are then fed into the CNN. The CNN takes this input as an image, and training of the proposed CNN then begins. The second method, RW-CNN, follows the same steps as the first but skips the MFCC phases, so the input is passed directly to the CNN. The same CNN structure is used in both methods. In this work, an accuracy of 96% was obtained for both RW-CNN and MFCC-CNN. The two methods give similar results, both with and without noise, although their performance varies. The system can learn from a large number of human voices with high accuracy and minimal processing requirements.
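To make the difference between the two input paths concrete, the following is a minimal NumPy sketch, not the authors' implementation. It computes a textbook MFCC matrix (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II), which could be treated as the image-like input of MFCC-CNN, and a framed raw waveform as one possible input for RW-CNN. The abstract gives no parameters, so the sample rate, frame length, hop size, filter count, coefficient count, and all function names here are assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to the mel scale."""
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Convert mel-scale values back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i, k] = (right - k) / max(right - center, 1)
    return fbank


def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping n_out coefficients."""
    n = x.shape[1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix. All defaults are assumed."""
    frames = frame_signal(signal, frame_len, hop) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(mel_energy + 1e-10), n_coeffs)


# Synthetic one-second "utterance": a 220 Hz tone plus additive noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
noisy = np.sin(2 * np.pi * 220 * t) + 0.1 * rng.standard_normal(t.size)

mfcc_input = mfcc(noisy)                   # MFCC-CNN path: 2-D feature map
raw_input = frame_signal(noisy, 400, 160)  # RW-CNN path (assumed framing)
print(mfcc_input.shape, raw_input.shape)
```

Either array could then be given a channel axis and passed to the same CNN, which matches the abstract's statement that both methods share one network structure. How the raw waveform is actually shaped before entering the CNN is not stated in the abstract, so the framing used for `raw_input` is only one plausible choice.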