Applying and Evaluating Large Language Models in Mental Health Care: A Scoping Review of Human-Assessed Generative Tasks

Yining Hua, Hongbin Na, Zehan Li, Fenglin Liu, Xiao Fang, David Clifton, John Torous
2024-08-21
Abstract: Large language models (LLMs) are emerging as promising tools for mental health care, offering scalable support through their ability to generate human-like responses. However, the effectiveness of these models in clinical settings remains unclear. This scoping review aimed to assess the current generative applications of LLMs in mental health care, focusing on studies where these models were tested with human participants in real-world scenarios. A systematic search across APA PsycNet, Scopus, PubMed, and Web of Science identified 726 unique articles, of which 17 met the inclusion criteria. These studies encompassed applications such as clinical assistance, counseling, therapy, and emotional support. However, the evaluation methods were often non-standardized, with most studies relying on ad hoc scales that limit comparability and robustness. Privacy, safety, and fairness were also frequently underexplored. Moreover, reliance on proprietary models, such as OpenAI's GPT series, raises concerns about transparency and reproducibility. While LLMs show potential in expanding mental health care access, especially in underserved areas, the current evidence does not fully support their use as standalone interventions. More rigorous, standardized evaluations and ethical oversight are needed to ensure these tools can be safely and effectively integrated into clinical practice.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how large language models (LLMs) are applied in mental health care and how their effectiveness is evaluated. Specifically, it reviews current generative applications of LLMs in mental health care, focusing on studies in which the models were tested with human participants in real-world scenarios. The paper focuses on the following aspects:

1. **Scope of application**: It surveys applications of LLMs in mental health care, such as clinical assistance, counseling, therapy, and emotional support.
2. **Evaluation methods**: It finds that current evaluation methods are often non-standardized, with most studies relying on ad hoc scales that limit the comparability and robustness of the results.
3. **Privacy, safety, and fairness**: It notes that these aspects are frequently underexplored in existing studies.
4. **Transparency and reproducibility**: It points out that reliance on proprietary models, such as OpenAI's GPT series, raises concerns about transparency and reproducibility.
5. **Potential value and limitations**: Although LLMs show potential for expanding access to mental health care, especially in underserved areas, the current evidence does not fully support their use as standalone interventions.

Overall, the paper maps the current state and shortcomings of LLM applications in mental health care through a scoping review of 17 included studies, and it calls for more rigorous, standardized evaluations and ethical oversight before these tools are integrated into clinical practice.