An Opportunistically Parallel Lambda Calculus for Performant Composition of Large Language Models

Stephen Mell, Steve Zdancewic, Osbert Bastani
2024-05-19
Abstract: Large language models (LLMs) have shown impressive results at a wide range of tasks. However, they have limitations, such as hallucinating facts and struggling with arithmetic. Recent work has addressed these issues with sophisticated decoding techniques. Performant decoding, particularly for sophisticated techniques, relies crucially on parallelization and batching, which are difficult for developers.
Programming Languages