Automatic Depression Score Estimation with Word Embedding Models (Preprint)

Anxo Pérez, Javier Parapar, Alvaro Barreiro
DOI: https://doi.org/10.2196/preprints.30484
2021-05-17
Abstract: BACKGROUND Depression is one of the most common mental health illnesses. Although effective treatments exist, the biggest obstacle lies in the efficient and early detection of the disorder. Self-report questionnaires are the instruments that medical experts use to reach a diagnosis. However, questionnaires have limitations: factors such as lack of awareness and social stigma reduce their effectiveness. In this context, social media platforms provide an indirect means of communication that can serve as a source of evidence for detecting people at risk.
OBJECTIVE This paper describes techniques to automatically estimate the degree of depression of social media users. We explore neural language models to exploit various aspects of the subjects' writings. Our proposals focus on automatically completing the Beck Depression Inventory-II (BDI-II), a validated psychometric test consisting of 21 items, each associated with a different symptom of depression.
METHODS We present three approaches, all based on neural language models, for automatically filling in the BDI-II questionnaire. The first captures the overall use of language and the communication patterns of individuals. The second narrows the user's representation to a limited set of answers to the BDI-II items extracted from their posts; for this, we use state-of-the-art question answering models based on bidirectional encoder representations. Finally, we propose a mixed model that selects whether to fill in each item using the first or the second model. The rationale behind the mixed model is that users readily mention the answers to some items in their texts, which makes the second method appropriate, whereas for more private or sensitive items the first method is the better alternative, given that users avoid writing about them explicitly.
RESULTS We addressed the task "Measuring the Severity of the Signs of Depression" of eRisk 2020, an initiative of the CLEF conference. In this task, participants have to fill in the BDI-II for the users in the collection provided by the task. We measured our results using the same accuracy metrics proposed by the competition and compared them with the rest of the 17 methods presented by participants. Our proposals outperformed almost all participants on every official metric.
CONCLUSIONS Our results show that techniques based on neural language models are a feasible alternative for estimating depression rating scales, even when only small amounts of training data are available (20 users). We observed that, depending on the symptom, it is more appropriate either to rely on general language patterns or to look for text that directly concerns the particular symptom. In summary, the results of this study demonstrate the potential of automatic text mining models to serve as a tool to help diagnose depression.
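As an illustration of the mixed model described under METHODS, the following minimal sketch chooses, for each of the 21 BDI-II items, between a general-language model and a question-answering model. The abstract does not state how the selection is made, so the rule used here (keep whichever model matched more training users on that item) is an assumption, as are all function and variable names. Answers are treated as opaque string labels.

```python
from typing import List

NUM_ITEMS = 21  # the BDI-II consists of 21 items


def choose_model_per_item(
    global_preds: List[List[str]],
    qa_preds: List[List[str]],
    gold: List[List[str]],
) -> List[str]:
    """Pick a model name per item from training users (hypothetical rule).

    Each argument holds one list of 21 answers per training user.
    Assumption: an item goes to the QA model only if it matched strictly
    more users than the general-language ("global") model.
    """
    choice = []
    for item in range(NUM_ITEMS):
        global_hits = sum(p[item] == g[item] for p, g in zip(global_preds, gold))
        qa_hits = sum(p[item] == g[item] for p, g in zip(qa_preds, gold))
        choice.append("qa" if qa_hits > global_hits else "global")
    return choice


def fill_questionnaire(
    choice: List[str],
    global_answers: List[str],
    qa_answers: List[str],
) -> List[str]:
    """Assemble one user's questionnaire, item by item, from the chosen model."""
    return [
        qa_answers[i] if choice[i] == "qa" else global_answers[i]
        for i in range(NUM_ITEMS)
    ]
```

With only 20 training users, a per-item comparison of this kind would be noisy, so this should be read as a sketch of the idea rather than as the authors' procedure.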