Sentiment Reasoning for Healthcare

Khai-Nguyen Nguyen, Khai Le-Duc, Bach Phan Tat, Duy Le, Long Vo-Dang, Truong-Son Hy
2024-10-11
Abstract: Transparency in AI healthcare decision-making is crucial for building trust between AI and users. Incorporating reasoning capabilities enables Large Language Models (LLMs) to understand emotions in context, handle nuanced language, and infer unstated sentiments. In this work, we introduce a new task -- Sentiment Reasoning -- for both speech and text modalities, along with our proposed multimodal multitask framework and dataset. Sentiment Reasoning is an auxiliary task in sentiment analysis where the model both predicts the sentiment label and generates the rationale behind it based on the input transcript. Our study, conducted on both human transcripts and Automatic Speech Recognition (ASR) transcripts, shows that Sentiment Reasoning helps improve model transparency by providing rationales for model predictions with quality semantically comparable to humans, while also improving model performance (1% increase in both accuracy and macro-F1) via rationale-augmented fine-tuning. We also find no significant difference in the semantic quality of generated rationales between human and ASR transcripts. All code, data (English-translated and Vietnamese) and models are published online: https://github.com/leduckhai/MultiMed
Computation and Language, Artificial Intelligence, Machine Learning, Sound, Audio and Speech Processing
What problem does this paper attempt to address?
This paper aims to improve both the transparency and the performance of sentiment analysis models for speech in the medical and healthcare domain. Specifically, the authors introduce a new task, Sentiment Reasoning, in which the model not only predicts a sentiment label (positive, neutral or negative) but also generates the rationale behind that prediction. The task is addressed with a multitask framework that covers two modalities (speech and text), with the goal of improving both the interpretability and the accuracy of sentiment analysis.

The main challenges mentioned in the paper include:

1. **Noise and variability of speech signals**: Speech signals are inherently noisy and vary significantly with accent, speaking style and recording conditions, which complicates the extraction of reliable acoustic features.
2. **Subjectivity and multi-dimensionality of emotions**: Emotions are subjective, complex and multi-dimensional. Even humans find them difficult to classify accurately, which motivates the use of Explainable AI.
3. **Transparency of medical decisions**: Errors in medical decisions may have serious consequences, so improving the transparency of AI decisions is crucial for building trust among machines, medical professionals and patients.

To address these challenges, the authors propose a novel multimodal framework for the Sentiment Reasoning task. Rationale-augmented fine-tuning improves model performance (accuracy and macro-F1 each increase by 1%), and the generated rationales are comparable in semantic quality to those written by humans. In addition, there is no significant difference in the semantic quality of rationales generated from human transcripts and from automatic speech recognition (ASR) transcripts.
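As a rough illustration of what "predict the label and generate the rationale" can look like in practice, the sketch below formats a rationale-augmented training target, parses a generated output back into a label and a rationale, and computes macro-F1 over the three sentiment labels. The `label:` / `rationale:` text layout, the function names, and the example sentence are assumptions made for illustration only; they are not taken from the paper or from the MultiMed repository.

```python
from typing import List, Optional, Tuple

# The three sentiment classes named in the summary above.
LABELS = ("negative", "neutral", "positive")


def build_target(label: str, rationale: str) -> str:
    """Format one training target carrying both label and rationale.

    The "label: ... / rationale: ..." layout is a hypothetical choice.
    """
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return f"label: {label}\nrationale: {rationale.strip()}"


def parse_output(text: str) -> Tuple[Optional[str], str]:
    """Recover (label, rationale) from generated text.

    The label is None when it is missing or not one of LABELS.
    """
    label, rationale = None, ""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "label" and value.lower() in LABELS:
            label = value.lower()
        elif key == "rationale":
            rationale = value
    return label, rationale


def macro_f1(gold: List[str], pred: List[Optional[str]]) -> float:
    """Unweighted mean of per-class F1 over all three labels.

    A class with no gold and no predicted examples contributes 0.0 here;
    other conventions exist.
    """
    scores = []
    for c in LABELS:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up example, not drawn from the dataset.
target = build_target("negative", "The patient reports worsening pain.")
print(target)
print(parse_output(target))
```

Keeping the label on its own line means classification metrics can be computed from the parsed label alone, while the rationale can be scored separately for semantic quality.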