Vietnamese Voice Classification based on Deep Learning Approach

Bui Thanh Hung
DOI: https://doi.org/10.30991/ijmlnce.2020v04i04.004
2020-01-01
International Journal of Machine Learning and Networked Collaborative Engineering
Abstract: In the digital era, voice classification plays a meaningful role in many aspects of life. In this research, we propose a method for predicting the gender and region of a Vietnamese speaker from the sound spectrum using a deep learning approach. Starting from the raw dataset, we preprocess the audio to bring all recordings to the same frequency and time standard. We then extract Mel spectrogram features and feed them into a deep learning model, a Convolutional Neural Network, for training and optimization. Our experiments on 37 samples taken from the VIVOS corpus audio dataset achieve an accuracy of 86.48% for predicting gender and 51.45% for predicting the region of the voice.
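The preprocessing and feature-extraction steps described in the abstract can be illustrated with a minimal NumPy-only sketch. This is not the authors' code. The sampling rate (16 kHz), the fixed clip length (1 second), the FFT size, hop length, and number of Mel bands are all assumptions chosen for illustration, and the input is a synthetic tone rather than a VIVOS recording. The helper names (`fix_length`, `mel_filterbank`, `mel_spectrogram`) are hypothetical. The CNN itself is omitted; the resulting 2-D array is the kind of input such a network would receive.

```python
import numpy as np

def fix_length(signal, target_len):
    """Zero-pad or truncate a 1-D signal to exactly target_len samples."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) >= target_len:
        return signal[:target_len]
    return np.pad(signal, (0, target_len - len(signal)))

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the Mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Log-power Mel spectrogram in dB, shape (n_mels, n_frames)."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Build an index matrix so that each row selects one overlapping frame.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft//2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T   # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)

# Demo on a synthetic 0.7 s, 440 Hz tone (assumed stand-in for a real recording).
sr = 16000                                   # assumed common sampling rate
t = np.arange(int(0.7 * sr)) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
clip = fix_length(tone, sr)                  # pad to the assumed 1 s standard
spec = mel_spectrogram(clip, sr)
print(spec.shape)                            # (40, 61)
```

Padding or truncating every clip to one length gives all spectrograms the same shape, which is what lets them be stacked into batches for a convolutional model. In practice a library such as librosa would typically handle resampling and Mel extraction; the hand-written version here only makes the individual steps visible.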