Evaluating Large Language Models for Anxiety and Depression Classification using Counseling and Psychotherapy Transcripts

Junwei Sun, Siqi Ma, Yiran Fan, Peter Washington
2024-07-18
Abstract: We aim to evaluate the efficacy of traditional machine learning and large language models (LLMs) in classifying anxiety and depression from long conversational transcripts. We fine-tune both established transformer models (BERT, RoBERTa, Longformer) and a more recent large model (Mistral-7B), train a Support Vector Machine with feature engineering, and assess GPT models through prompting. We observe that state-of-the-art models fail to enhance classification outcomes compared to traditional machine learning methods.
Computation and Language, Computers and Society, Emerging Technologies, Machine Learning
What problem does this paper attempt to address?
This paper evaluates how effectively traditional machine learning methods and large language models (LLMs) classify anxiety and depression from long conversational transcripts. Specifically, the researchers compare a Support Vector Machine with feature engineering against fine-tuned transformer models (BERT, RoBERTa, Longformer) and the larger Mistral-7B, to test whether current state-of-the-art deep learning techniques improve the recognition of complex mental states in long counseling and psychotherapy transcripts. The study also prompts GPT-series models through their APIs to assess how accurately and consistently they classify psychological symptoms in practice. Overall, the paper compares machine learning and deep learning methods on mental health classification tasks, with particular attention to how well each handles long text inputs.
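To illustrate why long inputs are difficult: encoder models such as BERT and RoBERTa accept a limited number of tokens per input (commonly 512), while a counseling transcript can run far longer. One common workaround, sketched below, is to split the transcript into overlapping windows, classify each window, and take a majority vote. This is only an assumption for illustration; the summary does not state that the authors used this approach. The function names `split_into_windows` and `classify_transcript` and the keyword-based toy classifier are hypothetical stand-ins for a real tokenizer and fine-tuned model.

```python
from collections import Counter
from typing import Callable, List


def split_into_windows(tokens: List[str], window: int = 512, stride: int = 256) -> List[List[str]]:
    """Split a token list into overlapping windows of at most `window` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(tokens) <= window:
        return [tokens]
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the transcript
        start += stride
    return windows


def classify_transcript(
    tokens: List[str],
    classify_window: Callable[[List[str]], str],
    window: int = 512,
    stride: int = 256,
) -> str:
    """Classify each window and return the most frequent label (majority vote)."""
    votes = Counter(classify_window(w) for w in split_into_windows(tokens, window, stride))
    return votes.most_common(1)[0][0]


# Hypothetical toy classifier: a real system would call a fine-tuned model here.
def toy_classifier(window_tokens: List[str]) -> str:
    return "depression" if "hopeless" in window_tokens else "neither"


# A 1000-token toy transcript with one keyword that falls inside two overlapping windows.
transcript = ["word"] * 1000
transcript[300] = "hopeless"
label = classify_transcript(transcript, toy_classifier)
```

Models with longer context windows, such as Longformer, reduce the need for this kind of splitting, which is one reason they appear in the comparison.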