Design and construction of 14 Arabic fricatives dataset, classification and characterization using CRNN, transformers, and H-CRNN

Aicha Zitouni, Leila Falek, Aissa Amrouche, Brahim Dahou, Mourad Abbas
DOI: https://doi.org/10.1007/s11042-024-18355-0
IF: 2.577
2024-02-23
Multimedia Tools and Applications
Abstract: A fricative is produced by bringing two articulators close together, which partially obstructs the airstream and creates turbulent airflow. The frequency spectrum of most fricatives resembles noise, which makes them difficult to process numerically. The literature contains many studies of fricatives, particularly in Latin languages, but very few address Arabic. The objective of this article is to create and validate a dataset of 14 Arabic fricatives using an acoustic characterization approach. For this purpose, we recorded a speech corpus from thirty speakers, yielding 25,200 units of analysis. The data were then analyzed and validated using deep learning-based classification methods, and two distinct approaches were proposed. The first uses three deep neural networks (Convolutional Recurrent Neural Network (CRNN), Vision Transformer (ViT), and ResNet50), each taking as input one of three feature representations (spectrogram, Mel-spectrogram, or Mel-frequency cepstral coefficients (MFCC)) extracted from Vowel-Consonant-Vowel (/VCV/) speech units. The best classification rate obtained with the CRNN model on the Mel-spectrogram was 94.07%, while the ViT model surpassed this with 95.64%. The second approach, a hierarchical H-CRNN model with fricatives organized into categories (sibilance, voicing, and place of articulation), improved this rate to 97.5%. Owing to the hierarchical approach, these results surpass those of the traditional approaches proposed in the literature for the classification of Arabic fricatives.
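The hierarchical idea in the abstract, deciding a broad category first and the individual fricative second, can be illustrated with a minimal sketch. This is not the authors' H-CRNN: the two-level tree, the four-consonant subset, the names `HIERARCHY`, `hierarchical_posteriors`, and `predict`, and all scores are assumptions made for illustration. In a real system each level's probabilities would come from a trained network.

```python
from typing import Dict, List

# Hypothetical two-level tree over a small subset of fricatives, grouped by
# sibilance only. Illustrative; not the category tree used in the paper.
HIERARCHY: Dict[str, List[str]] = {
    "sibilant": ["s", "z"],
    "non_sibilant": ["f", "th"],
}


def hierarchical_posteriors(
    group_probs: Dict[str, float],
    leaf_probs: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Combine P(group) and P(leaf | group) into a joint P(leaf)."""
    joint: Dict[str, float] = {}
    for group, leaves in HIERARCHY.items():
        for leaf in leaves:
            joint[leaf] = group_probs[group] * leaf_probs[group][leaf]
    return joint


def predict(
    group_probs: Dict[str, float],
    leaf_probs: Dict[str, Dict[str, float]],
) -> str:
    """Return the fricative label with the highest joint probability."""
    joint = hierarchical_posteriors(group_probs, leaf_probs)
    return max(joint, key=joint.get)


# Made-up scores standing in for the outputs of a coarse classifier and of
# one fine classifier per group.
group_probs = {"sibilant": 0.8, "non_sibilant": 0.2}
leaf_probs = {
    "sibilant": {"s": 0.3, "z": 0.7},
    "non_sibilant": {"f": 0.9, "th": 0.1},
}

if __name__ == "__main__":
    print(predict(group_probs, leaf_probs))
```

Multiplying the coarse and fine probabilities keeps the result a valid distribution over the leaves, so a confident coarse decision narrows the set of fricatives the fine stage has to separate.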
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering