Large Language Models Perform on Par with Experts Identifying Mental Health Factors in Adolescent Online Forums

Isabelle Lorge, Dan W. Joyce, Andrey Kormilitzin

2024-04-26

Abstract: Mental health in children and adolescents has been steadily deteriorating over the past few years. The recent advent of Large Language Models (LLMs) offers much hope for cost- and time-efficient scaling of monitoring and intervention. Yet despite issues specifically prevalent in this age group, such as school bullying and eating disorders, previous studies have not investigated LLM performance in this domain or for open information extraction, where the set of answers is not predetermined. We create a new dataset of Reddit posts from adolescents aged 12-19, annotated by expert psychiatrists for the following categories: TRAUMA, PRECARITY, CONDITION, SYMPTOMS, SUICIDALITY and TREATMENT, and compare expert labels to annotations from two top-performing LLMs (GPT3.5 and GPT4). In addition, we create two synthetic datasets to assess whether LLMs perform better when annotating data as they generate it. We find GPT4 to be on par with human inter-annotator agreement and performance on synthetic data to be substantially higher. However, the model still occasionally errs on issues of negation and factuality, and the higher performance on synthetic data is driven by the greater complexity of real data rather than by an inherent advantage.

Computation and Language

What problem does this paper attempt to address?

The paper addresses how effectively and accurately large language models (LLMs) can identify mental health factors in adolescents' social media posts, with a view to supporting monitoring and intervention. Specifically, the research objectives are:

1. **Generate and annotate a high-quality dataset**: Create a new dataset of Reddit posts from adolescents (aged 12-19), annotated by expert psychiatrists for six categories: trauma, precarity, condition, symptoms, suicidality, and treatment.
2. **Evaluate the performance of LLMs**: Compare two top-performing LLMs (GPT-3.5 and GPT-4) on extracting mental health factors from these posts, and check whether they reach a level comparable to expert annotators.
3. **Explore the utility of synthetic data**: Generate two synthetic datasets to evaluate how well LLMs annotate text as they generate it, and explore the potential use of such data for training task-specific models.

Through these objectives, the research aims to provide an efficient and cost-effective method for monitoring and intervention in adolescent mental health, while also offering new insights into the use of synthetic data in the healthcare domain.
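To make the comparison in objective 2 concrete, the following is a minimal sketch of one way to score agreement between expert and LLM annotations per category. It is an illustration under stated assumptions, not the paper's evaluation procedure: it reduces each annotation to the presence or absence of a category in a post (the paper's open information extraction task is richer than this), it uses Cohen's kappa as a stand-in agreement measure, and the function names and toy labels are hypothetical.

```python
from typing import Dict, List, Sequence, Set

# Category names taken from the abstract.
CATEGORIES = ["TRAUMA", "PRECARITY", "CONDITION",
              "SYMPTOMS", "SUICIDALITY", "TREATMENT"]


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) label sequences of equal length."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # share of positives from annotator A
    p_b = sum(b) / n  # share of positives from annotator B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    if expected == 1.0:
        # Both annotators gave the same constant label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


def per_category_kappa(expert: List[Set[str]],
                       model: List[Set[str]]) -> Dict[str, float]:
    """Kappa per category, treating each post as 'category present or not'."""
    return {
        c: cohen_kappa([int(c in e) for e in expert],
                       [int(c in m) for m in model])
        for c in CATEGORIES
    }


# Hypothetical toy labels for four posts (not data from the paper).
expert_labels = [{"TRAUMA", "SYMPTOMS"}, {"SYMPTOMS"}, set(), {"TREATMENT"}]
model_labels = [{"TRAUMA", "SYMPTOMS"}, set(), set(), {"TREATMENT", "CONDITION"}]

scores = per_category_kappa(expert_labels, model_labels)
for category, kappa in scores.items():
    print(f"{category}: {kappa:.2f}")
```

Running the same function on a second expert's labels instead of the model's would give a human inter-annotator baseline to compare against, which is the kind of comparison the abstract describes.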

A Comprehensive Evaluation of Large Language Models on Mental Illnesses

Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

Towards Interpretable Mental Health Analysis with Large Language Models

An Assessment on Comprehending Mental Health through Large Language Models

MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models

Large Language Model for Mental Health: A Systematic Review

Large Language Models for Automatic Detection of Sensitive Topics

Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis

Applying and Evaluating Large Language Models in Mental Health Care: A Scoping Review of Human-Assessed Generative Tasks

Large Language Models Help Reveal Unhealthy Diet and Body Concerns in Online Eating Disorders Communities

Can AI Relate: Testing Large Language Model Response for Mental Health Support

Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media

MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media

Rethinking Large Language Models in Mental Health Applications

Large language models in psychiatry: Opportunities and challenges

Psychological Assessments with Large Language Models: A Privacy-Focused and Cost-Effective Approach

Watch Your Language: Investigating Content Moderation with Large Language Models

Applications of large language models in psychiatry: a systematic review

Large Language Models in Mental Health Care: a Scoping Review

Harnessing Large Language Models' Empathetic Response Generation Capabilities for Online Mental Health Counselling Support