Approximately Aligned Decoding

Daniel Melcer, Sujan Gonugondla, Pramuditha Perera, Haifeng Qian, Wen-Hao Chiang, Yanjun Wang, Nihal Jain, Pranav Garg, Xiaofei Ma, Anoop Deoras
2024-10-02
Abstract: It is common to reject undesired outputs of Large Language Models (LLMs); however, current methods to do so either require an excessive amount of computation or severely distort the distribution of outputs. We present a method to balance the distortion of the output distribution with computational efficiency, allowing for the generation of long sequences of text with difficult-to-satisfy constraints, with less amplification of low-probability outputs compared to existing methods. We show through a series of experiments that the task-specific performance of our method is comparable to methods that do not distort the output distribution, while being much more computationally efficient.
Computation and Language, Artificial Intelligence