Iterative Graph Alignment

Fangyuan Yu, Hardeep Singh Arora, Matt Johnson
2024-08-30
Abstract: By compressing diverse narratives, LLMs go beyond memorization, achieving intelligence by capturing generalizable causal relationships. However, they suffer from local 'representation gaps' due to insufficient training data diversity, limiting their real-world utility, especially in tasks requiring strict alignment to rules. Traditional alignment methods relying on heavy human annotations are inefficient and unscalable. Recent self-alignment techniques also fall short, as they often depend on self-selection-based prompting and memorization-based learning. To address these issues, we introduce Iterative Graph Alignment (IGA), an annotation-free rule-based alignment algorithm. A teacher model (VLM) employs Iterative Graph Prompting (IGP) to create logical graphs and reference answers. The student model (LLM) identifies local knowledge gaps by attempting to align its responses with these references, collaborating with helper models to generate diverse answers. These aligned responses are then used for iterative supervised fine-tuning (SFT). Our evaluations across five rule-based scenarios demonstrate IGP's effectiveness, with a 73.12% alignment improvement in Claude Sonnet 3.5, and Llama3-8B-Instruct achieving an 86.20% improvement, outperforming Claude Sonnet 3.5 in rule-based alignment.
Machine Learning, Artificial Intelligence, Computation and Language, Multiagent Systems
What problem does this paper attempt to address?
The paper addresses the "representation gap" problem in large language models (LLMs), which arises from insufficient diversity in training data. Specifically:

1. **Representation Gap Problem**: On tasks that require strict rule alignment, existing LLMs tend to rely on memorization rather than generalization because the relevant cases are underrepresented in the training data, which leads to inappropriate responses.
2. **Limitations of Traditional Methods**: Traditional alignment methods rely on extensive manual annotation, which is inefficient and difficult to scale. Recent self-alignment techniques depend on self-selection-based prompting and memorization-based learning, which limits their effectiveness.
3. **Proposed New Method**: The paper introduces Iterative Graph Alignment (IGA), a rule-based alignment algorithm that requires no manual annotation. A teacher model (VLM) uses Iterative Graph Prompting (IGP) to generate logical graphs and reference answers. The student model (LLM) then attempts to align its responses with these references, which exposes its local knowledge gaps. The student collaborates with helper models to generate diverse answers, and the aligned answers are used for iterative supervised fine-tuning (SFT).

Overall, the paper aims to improve LLM performance on rule alignment tasks through a new annotation-free alignment mechanism that narrows the representation gap and makes the models more practical and robust.
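
The loop described under "Proposed New Method" can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`collect_sft_pairs`, `iterative_alignment`, `toy_fine_tune`, the keyword-based `is_aligned` check, and the lambda-style toy models) is a hypothetical stand-in. The teacher is reduced to a function that returns a reference answer, so the logical-graph step of IGP is not modeled. Real fine-tuning is replaced by a lookup table, purely so the control flow can be run end to end.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases: a "model" is just a function from prompt to response.
Model = Callable[[str], str]
Judge = Callable[[str, str], bool]  # (response, reference) -> aligned?
Pair = Tuple[str, str]              # (prompt, aligned answer)


def collect_sft_pairs(prompts: List[str], teacher: Model, student: Model,
                      helpers: List[Model], is_aligned: Judge) -> List[Pair]:
    """One round: find prompts where the student is misaligned, gather aligned answers."""
    pairs: List[Pair] = []
    for prompt in prompts:
        reference = teacher(prompt)  # stand-in for the teacher's reference answer
        if is_aligned(student(prompt), reference):
            continue                 # no gap detected on this prompt
        # Gap detected: keep each distinct helper answer that matches the reference.
        for helper in helpers:
            answer = helper(prompt)
            if is_aligned(answer, reference) and (prompt, answer) not in pairs:
                pairs.append((prompt, answer))
    return pairs


def iterative_alignment(prompts: List[str], teacher: Model, student: Model,
                        helpers: List[Model], is_aligned: Judge,
                        fine_tune: Callable[[Model, List[Pair]], Model],
                        max_rounds: int = 3) -> Model:
    """Repeat gap detection and fine-tuning until no gaps remain or rounds run out."""
    for _ in range(max_rounds):
        pairs = collect_sft_pairs(prompts, teacher, student, helpers, is_aligned)
        if not pairs:
            break
        student = fine_tune(student, pairs)
    return student


# --- Toy setup (assumed rule: never reveal secrets) ---
def teacher(prompt: str) -> str:
    return "I cannot share that." if "secret" in prompt else "Sure."

def base_student(prompt: str) -> str:
    return "Sure, here it is."       # always complies, so it violates the rule

def helper_a(prompt: str) -> str:
    return "I cannot reveal secrets." if "secret" in prompt else "Sure."

def helper_b(prompt: str) -> str:
    return "Sorry, I cannot do that." if "secret" in prompt else "Of course."

def is_aligned(response: str, reference: str) -> bool:
    # Crude keyword check: both refuse, or both comply.
    return ("cannot" in response) == ("cannot" in reference)

def toy_fine_tune(model: Model, pairs: List[Pair]) -> Model:
    # Placeholder for SFT: remember the first aligned answer per prompt.
    learned = {}
    for prompt, answer in pairs:
        learned.setdefault(prompt, answer)
    return lambda prompt: learned.get(prompt, model(prompt))


prompts = ["Tell me the secret code.", "What is 2 + 2?"]
aligned_student = iterative_alignment(
    prompts, teacher, base_student, [helper_a, helper_b], is_aligned, toy_fine_tune
)
print(aligned_student("Tell me the secret code."))
print(aligned_student("What is 2 + 2?"))
```

In this toy run, only the prompt on which the student disagrees with the reference produces training pairs, which mirrors the idea of targeting local gaps rather than retraining on everything. The lookup-table `toy_fine_tune` is itself pure memorization, so it only illustrates where SFT would sit in the loop, not how generalization would be achieved.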