Predicting Individual Depression Symptoms from Acoustic Features During Speech

Sebastian Rodriguez, Sri Harsha Dumpala, Katerina Dikaios, Sheri Rempel, Rudolf Uher, Sageev Oore

2024-06-23

Abstract: Current automatic depression detection systems provide predictions directly without relying on the individual symptoms/items of depression as denoted in the clinical depression rating scales. In contrast, clinicians assess each item in the depression rating scale in a clinical setting, thus implicitly providing a more detailed rationale for a depression diagnosis. In this work, we make a first step towards using the acoustic features of speech to predict individual items of the depression rating scale before obtaining the final depression prediction. For this, we use convolutional (CNN) and recurrent (long short-term memory (LSTM)) neural networks. We consider different approaches to learning the temporal context of speech. Further, we analyze two variants of voting schemes for individual item prediction and depression detection. We also include an animated visualization that shows an example of item prediction over time as the speech progresses.

Sound, Artificial Intelligence, Machine Learning, Audio and Speech Processing

What problem does this paper attempt to address?

This paper addresses the prediction of individual depression symptoms (the items of a clinical depression rating scale) from the acoustic features of speech. Existing automatic depression detection systems typically output an overall prediction directly, without relying on the individual symptom items in clinical depression rating scales. Clinicians, in contrast, assess each item when evaluating depression, which gives a more detailed rationale for the diagnosis. The goal of the paper is therefore to use the acoustic features of speech to first predict each item in the depression rating scale, and then make the final depression prediction.

The main contributions of the paper are:

1. **Using acoustic features to predict individual symptoms**: The authors use Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM) to predict each item in the depression rating scale.
2. **Analyzing different temporal context learning methods**: The study explores different approaches to learning the temporal context of speech.
3. **Voting scheme analysis**: The paper analyzes the impact of two voting schemes (hard voting and soft voting) on individual item prediction and depression detection.
4. **Visualizing prediction results**: An animated visualization shows an example of item prediction over time as the speech progresses.

Through these methods, the authors aim to better understand the decision-making process of machine learning models in depression detection and to provide more detailed diagnostic evidence.
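The difference between hard and soft voting can be illustrated with a small sketch. The summary above does not give the paper's exact voting definitions, so this assumes the common ones: hard voting takes the majority of per-segment class decisions, and soft voting averages per-segment class probabilities before deciding. The function names, the two-class setup, and the probability values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(segment_probs: Sequence[Sequence[float]]) -> int:
    """Each segment votes for its most likely class; the majority class wins."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in segment_probs]
    counts = Counter(votes)
    # Ties go to the lowest class index (an arbitrary choice for this sketch).
    return min(counts, key=lambda c: (-counts[c], c))


def soft_vote(segment_probs: Sequence[Sequence[float]]) -> int:
    """Average class probabilities over segments, then pick the top class."""
    n_segments = len(segment_probs)
    n_classes = len(segment_probs[0])
    means: List[float] = [
        sum(p[c] for p in segment_probs) / n_segments for c in range(n_classes)
    ]
    return max(range(n_classes), key=means.__getitem__)


# Hypothetical per-segment probabilities for one rating-scale item
# (class 0 = symptom absent, class 1 = symptom present).
example = [
    [0.55, 0.45],  # segment 1 leans weakly towards class 0
    [0.55, 0.45],  # segment 2 leans weakly towards class 0
    [0.10, 0.90],  # segment 3 is strongly class 1
]

hard_result = hard_vote(example)
soft_result = soft_vote(example)
print("hard vote:", hard_result)
print("soft vote:", soft_result)
```

In this example the two schemes disagree: two weakly confident segments outvote one strongly confident segment under hard voting, while soft voting lets the confident segment dominate the average.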

Hybrid Network Feature Extraction for Depression Assessment from Speech

Automatic Assessment of Depression from Speech Via a Hierarchical Attention Transfer Network and Attention Autoencoders

Dynamic Facial Features in Positive-Emotional Speech for Identification of Depressive Tendencies

Hierarchical Attention Transfer Networks for Depression Assessment from Speech

Self-Supervised Embeddings for Detecting Individual Symptoms of Depression

Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN

Automated speech-based screening of depression using deep convolutional neural networks

Evaluating Acoustic and Linguistic Features of Detecting Depression Sub-Challenge Dataset

Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks

Towards automatic text-based estimation of depression through symptom prediction

Improving Depression Prediction Accuracy Using Fisher Score-Based Feature Selection and Dynamic Ensemble Selection Approach Based on Acoustic Features of Speech

Automated depression analysis using convolutional neural networks from speech

Siamese Neural Network for Speech-Based Depression Classification and Severity Assessment

Automatic Detection of Depression from Stratified Samples of Audio Data

MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech

Arabic Speech Analysis for Classification and Prediction of Mental Illness due to Depression Using Deep Learning

Depression detection using cascaded attention based deep learning framework using speech data

WavDepressionNet: Automatic Depression Level Prediction Via Raw Speech Signals

What You Say or How You Say It? Depression Detection Through Joint Modeling of Linguistic and Acoustic Aspects of Speech

Bayesian Networks for the robust and unbiased prediction of depression and its symptoms utilizing speech and multimodal data