Abstract:
Introduction: Real-time evaluation of the severity of depressive symptoms is important for the diagnosis and treatment of patients with major depressive disorder (MDD). In clinical practice, evaluation relies mainly on psychological scales and doctor-patient interviews, which are time-consuming and labor-intensive, and the accuracy of the results depends largely on the subjective judgment of the clinician. With the development of artificial intelligence (AI) technology, machine learning methods are increasingly used to diagnose depression from appearance characteristics. Most previous research focused on single-modal data; in recent years, however, many studies have shown that multi-modal data yield better prediction performance than single-modal data. This study aimed to develop a measure of depression severity from expression and action features and to assess its validity among patients with MDD.
Methods: We proposed a multi-modal deep convolutional neural network (CNN) to evaluate the severity of depressive symptoms in real time, based on the detection of patients' facial expressions and body movements in videos captured by ordinary cameras. We established a behavioral depression degree (BDD) metric, which combines expression entropy and action entropy to measure the depression severity of MDD patients.
Results: We found that the information extracted from different modes, when integrated in appropriate proportions, can significantly improve the accuracy of the evaluation, which has not been reported in previous studies. The method achieved a Pearson similarity of over 74% between BDD and the self-rating depression scale (SDS), the self-rating anxiety scale (SAS), and the Hamilton depression scale (HAMD). In addition, we tracked and evaluated the changes in BDD in patients at different stages of a course of treatment, and the results agreed with the evaluations from the scales.
Discussion: Based on patients' expression and action features, the BDD can effectively measure the current state of a patient's depression and how it changes over time. Our model may provide an automatic auxiliary tool for the diagnosis and treatment of MDD.
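The abstract does not state how expression entropy and action entropy are computed or combined, so the following is only a minimal sketch of one plausible reading, not the authors' formula. It assumes that each video frame has already been assigned a discrete expression label and a discrete action label, that each entropy is the Shannon entropy of the label frequencies, and that the two are mixed with a weight `alpha`. The names `shannon_entropy`, `bdd_score`, `alpha`, and `pearson` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of `labels`."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def bdd_score(expression_labels, action_labels, alpha=0.5):
    """Hypothetical BDD: a weighted mix of expression and action entropy.

    `alpha` is an assumed mixing weight in [0, 1]; the abstract only says
    the modes are integrated "in appropriate proportions".
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    h_expr = shannon_entropy(expression_labels)
    h_act = shannon_entropy(action_labels)
    return alpha * h_expr + (1.0 - alpha) * h_act


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, one would compute `bdd_score` per patient and pass the resulting list, together with the matching scale scores (for example SDS), to `pearson`. The sketch makes no claim about the sign of the relationship or about which value of `alpha` the study used.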

Visually Interpretable Representation Learning for Depression Recognition from Facial Images

Dynamic Facial Features in Positive-Emotional Speech for Identification of Depressive Tendencies

Automatic Depression Prediction Via Cross-Modal Attention-Based Multi-Modal Fusion in Social Networks

Automatic Assessment of Depression from Speech Via a Hierarchical Attention Transfer Network and Attention Autoencoders

Deep Neural Networks for Depression Recognition Based on 2D and 3D Facial Expressions Under Emotional Stimulus Tasks

PRA-Net: Part-and-Relation Attention Network for depression recognition from facial expression

Hybrid Network Feature Extraction for Depression Assessment from Speech

DepNet: An automated industrial intelligent system using deep learning for video‐based depression analysis

Learning Content-Adaptive Feature Pooling for Facial Depression Recognition in Videos

Depressioner: Facial dynamic representation for automatic depression level prediction

FacialPulse: An Efficient RNN-based Depression Detection via Temporal Facial Landmarks

Catching Elusive Depression via Facial Micro-Expression Recognition

Dual‐task enhanced global–local temporal–spatial network for depression recognition from facial videos

Automatic diagnosis of depression based on attention mechanism and feature pyramid model

Multi-Scale and Multi-Region Facial Discriminative Representation for Automatic Depression Level Prediction

Neural Architecture Searching for Facial Attributes-based Depression Recognition

Speech depression recognition based on attentional residual network

Automatic identification of depressive symptoms in college students: an application of deep learning-based CNN (Convolutional Neural Network)

Hypergraph Neural Network for Multimodal Depression Recognition

A Deep Multiscale Spatiotemporal Network for Assessing Depression from Facial Dynamics

Measuring depression severity based on facial expression and body movement using deep convolutional neural network