D-ResNet-PVKELM: deep neural network and paragraph vector based kernel extreme machine learning model for multimodal depression analysis
Swasthika Jain T J, I. Jeena Jacob, Ajay Kumar Mandava
DOI: https://doi.org/10.1007/s11042-023-14351-y
IF: 2.577
2023-01-12
Multimedia Tools and Applications
Abstract: Depression heavily affects people's physical and mental health. It is associated with changes in mood, loss of interest, and stress, and it can lead to self-harm and suicide, so analyzing depression is important for reducing suicidal acts. In recent years, automatic depression assessment has been developed using computer vision techniques. Several models have been investigated for depression analysis, but they are limited to video and audio data. In this paper, a hybrid Artificial Intelligence (AI) based multimodal depression analysis approach is proposed, in which the severity of depression is assessed from multimodal data, namely video, audio, and text descriptors. First, the proposed approach estimates the Patient Health Questionnaire (PHQ) depression scale with a hybrid framework, the Residual Network based Deep Neural Network (D-ResNet), which computes the PHQ-8 score from video and audio features. Then, a Paragraph Vector Kernel Extreme Learning Machine (PV-KELM) is developed to infer the mental and physical states of individuals related to the psychoanalytic features of depression; it recognizes the presence or absence of the measured psychoanalytic symptoms. Finally, the PHQ-8 score estimated by D-ResNet and the psychoanalytic symptoms recognized by PV-KELM are fed together into an ensemble classifier. The ensemble uses three classifiers, namely Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (DT), to classify whether the individual is depressed or not. The proposed approach is implemented in Python, and the experiments are carried out on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) interview depression dataset. The proposed approach obtains an accuracy of 0.89, precision of 0.86, recall of 0.86, F-measure of 0.86, RMSE of 0.373, MAE of 0.35, JSD of 0.355, and contextual similarity of 0.689. The proposed approach is compared with state-of-the-art approaches, and the results show its efficiency.
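The final ensemble stage can be illustrated with a minimal sketch. The abstract does not say how the SVM, NB, and DT outputs are combined, so hard majority voting is assumed here. The feature layout (PHQ-8 score followed by binary symptom flags), the three stub classifiers, and their thresholds are hypothetical stand-ins for trained models, not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Sequence

# Assumed feature layout (illustrative only):
# [estimated PHQ-8 score, symptom_1, ..., symptom_k], symptoms as 0/1 flags.
Features = Sequence[float]
Classifier = Callable[[Features], int]  # 1 = depressed, 0 = not depressed


def majority_vote(classifiers: Sequence[Classifier], x: Features) -> int:
    """Return the label predicted by most classifiers (hard voting)."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical rule-based stand-ins for the trained SVM, NB, and DT models.
def svm_stub(x: Features) -> int:
    # Uses the commonly applied PHQ-8 cutoff of 10.
    return int(x[0] >= 10)


def nb_stub(x: Features) -> int:
    # Votes "depressed" when at least three symptoms are present.
    return int(sum(x[1:]) >= 3)


def dt_stub(x: Features) -> int:
    # Combines a lower score threshold with a symptom count.
    return int(x[0] >= 8 and sum(x[1:]) >= 2)


ENSEMBLE = [svm_stub, nb_stub, dt_stub]

if __name__ == "__main__":
    sample = [9, 1, 1, 1, 0]  # PHQ-8 score of 9 with three symptoms present
    print(majority_vote(ENSEMBLE, sample))
```

With three binary voters a tie cannot occur, which is one practical reason to use an odd number of base classifiers in a hard-voting ensemble.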
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering