Dynamical Behaviors of the Gradient Flows for In-Context Learning

Songtao Lu, Yingdong Lu, Tomasz Nowicki
2024-12-22
Abstract: We derive the system of differential equations for the gradient flow characterizing the training process of linear in-context learning in full generality. Next, we explore the geometric structure of the gradient flows in two instances, including identifying their invariants, optima, and saddle points. This understanding allows us to quantify the behavior of the two gradient flows under the full generality of parameters and data.
Dynamical Systems