ERD: A Framework for Improving LLM Reasoning for Cognitive Distortion Classification

Sehee Lim, Yejin Kim, Chi-Hyun Choi, Jy-yong Sohn, Byung-Hoon Kim
2024-03-21
Abstract: Improving the accessibility of psychotherapy with the aid of Large Language Models (LLMs) has garnered significant attention in recent years. Recognizing cognitive distortions from the interviewee's utterances can be an essential part of psychotherapy, especially for cognitive behavioral therapy. In this paper, we propose ERD, which improves LLM-based cognitive distortion classification performance by adding two modules: (1) extracting the parts related to cognitive distortion, and (2) debating the reasoning steps among multiple agents. Our experimental results on a public dataset show that ERD improves the multi-class F1 score as well as the binary specificity score. Regarding the latter score, our method turns out to be effective in debiasing the baseline method, which has a high false positive rate, especially when the summary of the multi-agent debate is provided to LLMs.
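To make the link between the binary specificity score and the false positive rate concrete, here is a minimal sketch using the standard definitions: specificity = TN / (TN + FP) and false positive rate = 1 - specificity. The function name, the label encoding (1 = distortion present, 0 = absent), and the toy labels are assumptions for illustration only; they are not data or code from the paper.

```python
from typing import Sequence


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly negative cases (label 0) that are predicted negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("specificity is undefined without negative examples")
    return tn / (tn + fp)


# Toy labels (hypothetical): a classifier that over-diagnoses distortions.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [1, 1, 1, 0, 1, 1]

spec = specificity(y_true, y_pred)
false_positive_rate = 1.0 - spec
print(spec, false_positive_rate)  # 0.25 0.75
```

In this toy case, three of the four harmless utterances are flagged as distorted, so specificity is low even though every true distortion is caught.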
Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper addresses two major challenges that existing large language model (LLM) methods face in recognizing cognitive distortions:

1. **Over-diagnosis**: Existing methods such as Diagnosis-of-Thought (DoT) tend to over-diagnose cognitive distortions, inferring unreasonable thinking patterns even when the user's statements are harmless.
2. **Poor multi-class classification performance**: In a multi-class setting, the classification performance of DoT is close to random guessing, which limits its use in practical applications.

To address these issues, the authors propose a new framework, ERD (Extraction-Reasoning-Debate), which improves the cognitive distortion classification performance of LLMs by adding modules for extracting relevant parts and for multi-agent debate. The framework consists of three steps:

1. **Extraction**: Extract the parts of the user's utterance that are related to cognitive distortions.
2. **Reasoning**: Generate the thought process that estimates cognitive distortions.
3. **Debate**: Multiple LLM agents debate the reasoning process, and a final decision is made.

Experimental results show that ERD improves on the baseline in both the multi-class F1 score and the binary specificity score. In particular, ERD is effective at reducing the false positive rate.
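The three steps above can be sketched as a small pipeline. This is only an illustrative outline under stated assumptions, not the authors' implementation: the `LLM` callable interface, the prompt tags (`EXTRACT`, `REASON`, `DEBATE`, `DECIDE`), the separate `judge` model, the `rounds` parameter, and `stub_llm` are all hypothetical placeholders.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a reply.
LLM = Callable[[str], str]


def extract(llm: LLM, utterance: str) -> str:
    """Step 1: keep only the parts of the utterance relevant to a distortion."""
    return llm(f"EXTRACT: {utterance}")


def reason(llm: LLM, extracted: str) -> str:
    """Step 2: generate the reasoning that estimates the cognitive distortion."""
    return llm(f"REASON: {extracted}")


def debate(agents: List[LLM], judge: LLM, reasoning: str, rounds: int = 1) -> str:
    """Step 3: agents comment on the reasoning in turn; a judge then decides."""
    transcript = [reasoning]
    for _ in range(rounds):
        for agent in agents:
            # Each agent sees everything said so far (an assumed protocol).
            transcript.append(agent("DEBATE: " + "\n".join(transcript)))
    return judge("DECIDE: " + "\n".join(transcript))


def erd(llm: LLM, agents: List[LLM], judge: LLM, utterance: str, rounds: int = 1) -> str:
    """Chain extraction, reasoning, and debate into one classification call."""
    extracted = extract(llm, utterance)
    reasoning = reason(llm, extracted)
    return debate(agents, judge, reasoning, rounds)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call: echoes the task tag of the prompt."""
    return "[" + prompt.split(":", 1)[0] + "]"


print(erd(stub_llm, [stub_llm, stub_llm], stub_llm, "I always fail."))  # [DECIDE]
```

Replacing `stub_llm` with real model calls and real prompts would be needed for actual use; the sketch only shows how the output of each step feeds the next.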