Decoding with Limited Teacher Supervision Requires Understanding When to Trust the Teacher

Hyunjong Ok, Jegwang Ryu, Jaeho Lee
2024-10-03
Abstract: How can small-scale large language models (sLLMs) efficiently utilize the supervision of large language models (LLMs) to improve their generative quality? This question has been well studied in scenarios where there is no restriction on the number of LLM supervisions one can use, giving birth to many decoding algorithms that utilize supervision without further training. However, it is still unclear what an effective strategy is under the *limited supervision* scenario, where we assume that no more than a few tokens can be generated by LLMs. To this end, we develop an algorithm to effectively aggregate the sLLM and LLM predictions on initial tokens so that the generated tokens can more accurately condition the subsequent token generation by the sLLM alone. Critically, we find that it is essential to adaptively overtrust or disregard the LLM prediction based on the confidence of the sLLM. Through our experiments on a wide range of models and datasets, we demonstrate that our method provides a consistent improvement over conventional decoding strategies. Code: https://github.com/HJ-Ok/DecLimSup
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how a small-scale large language model (sLLM) can generate high-quality text when supervision from a large language model (LLM) is limited. Specifically, the researchers study how to combine sLLM and LLM predictions when the LLM can supply only a few tokens. Under this constraint, the conventional strategy of over-trusting the LLM is not always optimal, so a new algorithm is needed to decide dynamically whether to trust the teacher model (LLM) or the student model (sLLM), and to what extent.

The main contributions of the paper include:

1. **Defining the sLLM decoding problem under limited supervision**, which is a research direction of practical importance.
2. **Discovering that under limited supervision conditions, the traditional strategy of over-trusting the LLM is often not optimal**.
3. **Proposing an entropy-based mechanism** for determining which of the sLLM and the LLM should be over-trusted and to what extent, and verifying its effectiveness in multiple settings.

Through these contributions, the paper offers a way to use large language models efficiently in resource-constrained environments.
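The idea of an entropy-based choice between over-trusting the teacher and over-trusting the student can be illustrated with a minimal sketch. This is not the authors' implementation: the linear logit combination, the function names (`choose_alpha`, `aggregate`, `first_token`), the threshold, the two alpha values, and even the direction of the rule (lean on the teacher when the student is uncertain) are all assumptions made for illustration. The repository linked in the abstract is the place to check the actual method.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def choose_alpha(student_probs, threshold=0.5,
                 alpha_confident=-0.5, alpha_uncertain=1.5):
    """Hypothetical rule (assumed, not from the paper): map the student's
    normalized entropy to a trust weight for the teacher."""
    normalized = entropy(student_probs) / math.log(len(student_probs))
    return alpha_uncertain if normalized > threshold else alpha_confident


def aggregate(student_logits, teacher_logits, alpha):
    """Assumed combination rule: student + alpha * (teacher - student).
    alpha = 0 uses the student only, alpha = 1 the teacher only,
    alpha > 1 over-trusts the teacher, alpha < 0 over-trusts the student."""
    return [s + alpha * (t - s) for s, t in zip(student_logits, teacher_logits)]


def first_token(student_logits, teacher_logits):
    """Pick the first token greedily from the aggregated logits."""
    alpha = choose_alpha(softmax(student_logits))
    combined = aggregate(student_logits, teacher_logits, alpha)
    token = max(range(len(combined)), key=combined.__getitem__)
    return token, alpha
```

In the limited-supervision setting, a function like `first_token` would be called only for the first few positions, where teacher logits are available; every later token would come from the student alone, conditioned on those initial tokens. A single threshold is the simplest possible mapping from entropy to a trust weight and is used here only to keep the sketch short.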