Spoofed Voice Detection using Dense Features of STFT and MDCT Spectrograms

Aniqa Dilawari, Summra Saleem, Usman Ghani Khan
DOI: https://doi.org/10.1109/ICAI52203.2021.9445259
2021-04-05
Abstract: Authenticating audio signals to detect voice forgery is a challenging task. In this work, a deep convolutional neural network (CNN) is used to detect audio manipulations, specifically pitch-shifted and amplitude-varied signals. Short-time Fourier transform (STFT) and modified discrete cosine transform (MDCT) features are extracted from the audio, and their plotted spectrogram patterns are fed to the CNN. Experimental results on the TIMIT dataset show that the model can successfully distinguish tampered signals, which facilitates audio authentication. The proposed CNN architecture detects pitch-shifted spoofed voices with an accuracy of 97.55% and amplitude-varied spoofed voices with an accuracy of 98.85%.
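As a rough illustration of the two front-end representations named in the abstract, the sketch below computes an STFT magnitude spectrogram and an MDCT magnitude spectrogram with NumPy. It is not the authors' pipeline: the function names, the 512-sample frame, the 256-sample hop, the Hann and sine windows, and the decibel scaling are all assumptions chosen for illustration, and a synthetic 1 kHz tone sampled at 16 kHz stands in for a TIMIT utterance.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def stft_magnitude(x, frame_len=512, hop=256):
    """STFT magnitude, shape (frame_len // 2 + 1, n_frames). Hann window assumed."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, axis=1)).T

def mdct_magnitude(x, frame_len=512):
    """MDCT magnitude, shape (frame_len // 2, n_frames).

    Uses the direct definition with N = frame_len // 2 coefficients per frame:
    X[k] = sum_n w[n] x[n] cos(pi / N * (n + 1/2 + N/2) * (k + 1/2)),
    with 50% overlap and a sine window (both assumed here).
    """
    n_half = frame_len // 2
    n = np.arange(frame_len)
    k = np.arange(n_half)
    window = np.sin(np.pi * (n + 0.5) / frame_len)
    basis = np.cos(np.pi / n_half * np.outer(k + 0.5, n + 0.5 + n_half / 2))
    frames = frame_signal(x, frame_len, n_half) * window
    return np.abs(frames @ basis.T).T

def to_db(mag, eps=1e-10):
    """Convert magnitudes to decibels; eps avoids log(0)."""
    return 20.0 * np.log10(mag + eps)

# Synthetic stand-in for a speech clip: one second of a 1 kHz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 1000.0 * t)

stft_db = to_db(stft_magnitude(x))   # array of shape (257, 61)
mdct_db = to_db(mdct_magnitude(x))   # array of shape (256, 61)
```

Each resulting array can be rendered as an image (for example with a plotting library) and passed to an image classifier. How the plots are rendered, sized, and labeled for the CNN is not specified in the abstract and is left open here.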
Computer Science