PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation

Jinpeng Hu, Tengteng Dong, Luo Gang, Hui Ma, Peng Zou, Xiao Sun, Dan Guo, Meng Wang
2024-08-07
Abstract: Mental health has attracted substantial attention in recent years, and large language models (LLMs) can be an effective technology for alleviating mental health problems owing to their capabilities in text understanding and dialogue. However, existing research in this domain often suffers from limitations, such as training on datasets that lack crucial prior knowledge and evidence, and the absence of comprehensive evaluation methods. In this paper, we propose a specialized psychological LLM, named PsycoLLM, trained on a proposed high-quality psychological dataset that includes single-turn QA, multi-turn dialogues, and knowledge-based QA. Specifically, we construct multi-turn dialogues through a three-step pipeline comprising generation, evidence judgment, and refinement. We augment this process with real-world psychological case backgrounds extracted from online platforms, enhancing the relevance and applicability of the generated data. Additionally, to compare the performance of PsycoLLM with other LLMs, we develop a comprehensive psychological benchmark based on authoritative psychological counseling examinations in China, which includes assessments of professional ethics, theoretical proficiency, and case analysis. The experimental results on the benchmark illustrate the effectiveness of PsycoLLM, which demonstrates superior performance compared to other LLMs.
Computation and Language
What problem does this paper attempt to address?
The paper attempts to address two key challenges in applying psychological language models to mental health:

1. **Quality of datasets**: Existing psychological language models are often trained on datasets that lack important prior knowledge and evidence, so the generated data may not align with real-world psychological communication scenarios.
2. **Insufficient evaluation methods**: Most current research evaluates psychological language models through subjective expert judgment or relies heavily on other large models. These methods may not fully reflect the nuanced performance of the models.

To address these issues, the paper proposes a specialized psychological language model, PsycoLLM, trained on a high-quality psychological dataset that includes single-turn Q&A, multi-turn dialogues, and knowledge-based Q&A. The paper also introduces a comprehensive benchmark based on authoritative Chinese psychological counseling examinations. The benchmark has three parts (professional ethics, theoretical knowledge, and case analysis) and is intended to assess both the model's theoretical mastery and its practical application capabilities.
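
As a rough illustration of how results on a three-part benchmark like this might be reported, the sketch below computes accuracy separately for each part. It is not the paper's evaluation code: the record layout (`section`, `gold`, `pred`), the section identifiers, and the assumption that every question is multiple-choice and scored by exact match are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Section names mirror the three benchmark parts named in the summary.
# The identifiers themselves are hypothetical, chosen for this sketch.
SECTIONS = ("professional_ethics", "theoretical_knowledge", "case_analysis")


def accuracy_by_section(records):
    """Return exact-match accuracy per section.

    Assumes each record is a dict with a "section" name plus "gold" and
    "pred" answer strings made of option letters (for example "A" or "BD").
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        section = record["section"]
        if section not in SECTIONS:
            raise ValueError(f"unknown section: {section}")
        total[section] += 1
        # Compare option letters as sets so that "AB" and "ba" match.
        if set(record["pred"].upper()) == set(record["gold"].upper()):
            correct[section] += 1
    # Skip sections with no questions to avoid dividing by zero.
    return {s: correct[s] / total[s] for s in SECTIONS if total[s]}


# Toy records, invented purely to exercise the function.
records = [
    {"section": "professional_ethics", "gold": "A", "pred": "A"},
    {"section": "professional_ethics", "gold": "B", "pred": "C"},
    {"section": "theoretical_knowledge", "gold": "AB", "pred": "BA"},
    {"section": "case_analysis", "gold": "D", "pred": "d"},
]
scores = accuracy_by_section(records)
```

Reporting one score per part, rather than a single pooled number, keeps a weakness in one area such as case analysis from being hidden by strength in another.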