Abstract: Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), owing to their excellent understanding and generation abilities. Remarkably, what further sets these models apart is the massive amount of world knowledge they internalize during pretraining. While many downstream applications provide the model with an informational context to aid its performance on the underlying task, how the model's world knowledge interacts with the factual information presented in the context remains underexplored. As a desirable behavior, an LLM should give precedence to the context whenever it contains task-relevant information that conflicts with the model's memorized knowledge. This enables model predictions to be grounded in the context, which can then be used to update or correct specific model predictions without frequent retraining. By contrast, when the context is irrelevant to the task, the model should ignore it and fall back on its internal knowledge. In this paper, we undertake a first joint study of these two properties, namely controllability and robustness, in the context of LLMs. We demonstrate that state-of-the-art T5 and PaLM models (both pretrained and finetuned) can exhibit poor controllability and robustness, and that neither property improves with increasing model size. As a solution, we propose a novel method, Knowledge Aware FineTuning (KAFT), to strengthen both controllability and robustness by incorporating counterfactual and irrelevant contexts into standard supervised datasets. Our comprehensive evaluation showcases the utility of KAFT across model architectures and sizes.

How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study

Probing Language Models on Their Knowledge Source

Supervised Knowledge Makes Large Language Models Better In-context Learners

Can Large Language Models Understand Context?

Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers

Large Language Models with Controllable Working Memory

Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis

Towards Uncovering How Large Language Model Works: An Explainability Perspective

When Context Leads but Parametric Memory Follows in Large Language Models

Large Language Models Know What Makes Exemplary Contexts

Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models

Can large language models explore in-context?

Probing LLMs for Joint Encoding of Linguistic Categories

Probing Causality Manipulation of Large Language Models

Why Larger Language Models Do In-context Learning Differently?

Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style

Probing Pretrained Language Models for Lexical Semantics

E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning

Large Language Models are In-context Teachers for Knowledge Reasoning

Context Matters: Data-Efficient Augmentation of Large Language Models for Scientific Applications