Diagnosing and Remedying Knowledge Deficiencies in LLMs via Label-free Curricular Meaningful Learning

Kai Xiong, Xiao Ding, Li Du, Jiahao Ying, Ting Liu, Bing Qin, Yixin Cao
2024-08-21
Abstract: Large Language Models (LLMs) are versatile and demonstrate impressive generalization ability by mining and learning information from extensive unlabeled text. However, they still exhibit reasoning mistakes, often stemming from knowledge deficiencies, which can affect their trustworthiness and reliability. Although users can provide diverse and comprehensive queries, obtaining sufficient and effective feedback is demanding. Furthermore, evaluating LLMs comprehensively with limited labeled samples is difficult. This makes it a challenge to diagnose and remedy the deficiencies of LLMs through rich label-free user queries. To tackle this challenge, we propose a label-free curricular meaningful learning framework (LaMer). LaMer first employs relative entropy to automatically diagnose and quantify the knowledge deficiencies of LLMs in a label-free setting. Next, to remedy the diagnosed knowledge deficiencies, we apply curricular meaningful learning: first, we adopt meaningful learning to adaptively synthesize augmentation data according to the severity of the deficiencies, and then design a curricular deficiency remedy strategy to remedy the knowledge deficiencies of LLMs progressively. Experiments show that LaMer efficiently and effectively diagnoses and remedies knowledge deficiencies in LLMs, improving various LLMs across seven out-of-distribution (OOD) reasoning and language understanding benchmarks, achieving comparable results to baselines with just 40% training data. LaMer even surpasses methods that rely on labeled datasets for deficiency diagnosis. In application, our label-free method can offer an effective knowledge deficiency diagnostic tool for efficient LLM development.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses knowledge deficiencies that cause large language models (LLMs) to make reasoning errors, and it proposes a way to diagnose and remedy those deficiencies without labeled data. Although LLMs learn rich information from large amounts of unlabeled text, they still make reasoning mistakes in some cases, which the authors attribute mainly to insufficient knowledge or improper application of existing knowledge. Relying on user feedback to improve LLMs is often impractical: it demands extra effort, and users typically ask questions whose answers they do not fully understand themselves.

To address this, the authors propose LaMer, a label-free curricular meaningful learning framework. LaMer first uses relative entropy to automatically diagnose and quantify the knowledge deficiencies of an LLM without relying on labeled data. It then applies curricular meaningful learning: it adaptively synthesizes augmentation data according to the severity of each deficiency, and it follows a curricular remedy strategy that repairs the deficiencies progressively.

Experiments show that LaMer diagnoses and remedies knowledge deficiencies effectively, improving various LLMs across seven out-of-distribution (OOD) reasoning and language understanding benchmarks. It achieves results comparable to the baselines with only 40% of the training data and even surpasses methods that rely on labeled datasets for deficiency diagnosis. In summary, the paper aims to provide an efficient and cost-effective way to diagnose and improve existing LLMs, making them more reliable and trustworthy across application scenarios.
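
To make the diagnose-then-remedy idea concrete, the following is a minimal Python sketch and not the authors' implementation. The summary above only says that relative entropy quantifies deficiencies, that augmentation data scales with severity, and that the remedy is progressive. Everything else here is an assumption made for illustration: the function names (`relative_entropy`, `allocate_augmentation`, `curriculum_order`) are hypothetical, the choice to compare a model's answer distribution without and with relevant knowledge is one plausible reading, and the proportional budget split and mild-to-severe ordering are guesses at how "adaptive" and "curricular" might be realized.

```python
import math


def relative_entropy(p, q, eps=1e-12):
    """Relative entropy KL(p || q) in nats for two discrete distributions.

    Assumption: p is the model's answer distribution with relevant knowledge
    supplied, q is the distribution without it. A large value suggests the
    knowledge changes the prediction, i.e. the model may lack it.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of options")
    # Terms with p_i == 0 contribute nothing; eps guards against log(0) in q.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def allocate_augmentation(scores, budget):
    """Split a synthetic-example budget in proportion to deficiency severity.

    Assumption: a simple proportional rule with flooring; the paper's actual
    allocation rule is not given in the summary.
    """
    total = sum(scores.values())
    if total == 0:
        return {name: 0 for name in scores}
    return {name: int(budget * s / total) for name, s in scores.items()}


def curriculum_order(scores):
    """Order deficiencies from mild to severe (assumed curriculum direction)."""
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical deficiency scores for three unlabeled queries.
    scores = {
        "query_a": relative_entropy([0.9, 0.1], [0.5, 0.5]),
        "query_b": relative_entropy([0.6, 0.4], [0.5, 0.5]),
        "query_c": relative_entropy([0.5, 0.5], [0.5, 0.5]),
    }
    print(curriculum_order(scores))
    print(allocate_augmentation(scores, budget=100))
```

Under these assumptions no gold label is needed at any point: the score depends only on how the model's own output distribution shifts, which is what makes the diagnosis label-free.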