Speech Emotion Recognition Using openSMILE and GPT 3.5 Transformer

Darius Turcian, Vasile Stoicu-Tivadar
DOI: https://doi.org/10.3233/SHTI240562
2024-08-22
Abstract: In recent years, artificial intelligence and machine learning (ML) models have advanced significantly, offering transformative solutions across diverse sectors. Speech emotion recognition has benefited in particular from ML techniques, which have improved its accuracy and applicability. This article proposes a method for emotion detection in Romanian speech that combines two distinct approaches: semantic analysis using a GPT Transformer and acoustic analysis using openSMILE. The method achieved an accuracy of 74% and a precision of almost 82%. Several system limitations were observed, attributed to the small size and low quality of the dataset. The work nevertheless opens a new direction in our research: analyzing emotions to help identify mental health disorders.
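The abstract does not say how the semantic and acoustic outputs are combined, nor how precision was averaged. As an illustration only, the sketch below assumes a simple weighted late fusion of per-emotion scores and a macro-averaged precision. The label set, the function names, and the `acoustic_weight` parameter are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical label set; the abstract does not list the emotions used.
EMOTIONS = ["anger", "happiness", "neutral", "sadness"]


def fuse_scores(
    acoustic: Dict[str, float],
    semantic: Dict[str, float],
    acoustic_weight: float = 0.5,
) -> str:
    """Weighted average of two per-emotion score dicts; returns the top label.

    Assumed fusion rule (not the authors' documented method). Labels missing
    from either dict are treated as having a score of 0.0.
    """
    fused = {
        label: acoustic_weight * acoustic.get(label, 0.0)
        + (1.0 - acoustic_weight) * semantic.get(label, 0.0)
        for label in EMOTIONS
    }
    return max(fused, key=fused.get)


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of utterances whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of per-class precision.

    Only classes that were predicted at least once are averaged, which is
    one of several possible conventions.
    """
    per_class = []
    for label in sorted(set(y_pred)):
        truths_for_label = [t for t, p in zip(y_true, y_pred) if p == label]
        hits = sum(t == label for t in truths_for_label)
        per_class.append(hits / len(truths_for_label))
    return sum(per_class) / len(per_class)
```

With equal weights, acoustic scores of `{"anger": 0.6, "sadness": 0.3, "neutral": 0.1}` and semantic scores of `{"anger": 0.2, "sadness": 0.7, "neutral": 0.1}` fuse to "sadness"; raising `acoustic_weight` to 0.8 flips the decision to "anger". Because accuracy counts all correct predictions while precision is computed per predicted class, the two metrics can differ on the same set of predictions.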