Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks

Graziano A. Manduzio, Federico A. Galatolo, Mario G. C. A. Cimino, Enzo Pasquale Scilingo, Lorenzo Cominelli
2024-10-25
Abstract: Recent advancements in Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. While these models excel in general complex reasoning tasks, they still face challenges in mathematical problem-solving and logical reasoning. To address these limitations, researchers have explored function calling abilities, allowing LLMs to execute provided functions and utilize their outputs for task completion. However, concentrating on specific tasks can be very inefficient for large-scale LLMs to be used, because of the expensive cost of training and inference stages they need in terms of computational resources. This study introduces a novel framework for training smaller language models in function calling, focusing on specific logical and mathematical reasoning tasks. The approach aims to improve performances of small-scale models for these tasks using function calling, ensuring a high level of accuracy. Our framework employs an agent that, given a problem and a set of callable functions, queries the LLM by injecting a description and examples of the usable functions into the prompt and managing their calls in a step-by-step reasoning chain. This process is used to create a dataset of correct and incorrect reasoning chain chat completions from a large-scale LLM. This dataset is used to train a smaller LLM using Reinforcement Learning from Human Feedback (RLHF), specifically employing the Direct Preference Optimization (DPO) technique. Experimental results demonstrate how the proposed approach balances the trade-off between model size and performance, improving the ability of function calling for reasoning tasks, in smaller models.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the weaknesses of large language models (LLMs) in mathematical problem-solving and logical reasoning, and the inefficiency of applying large models to such specific tasks. Specifically:

1. **Challenges in Mathematical Problem-Solving and Logical Reasoning**: Although LLMs perform well in natural language understanding and generation, they still struggle with mathematical problem-solving and logical reasoning tasks.
2. **Resource Consumption**: LLMs require large amounts of computational resources during both training and inference, which makes them inefficient and costly for specific tasks.
3. **Performance of Small-Scale Models**: To overcome these problems, the authors seek a framework that improves the performance of small language models on specific logical and mathematical reasoning tasks while maintaining a high level of accuracy.

To this end, the paper proposes a framework that improves small language models on these tasks by training them in function calling. The main steps of the framework are:

- **Defining Tasks and Problems**: Select a set of logical and mathematical reasoning tasks as test cases.
- **Defining Callable Functions**: Define a set of functions for each problem. These functions help the model complete reasoning steps, control the reasoning chain, and verify intermediate and final results.
- **Generating Datasets**: Use a large-scale LLM, queried through an agent, to generate a dataset of correct and incorrect reasoning-chain chat completions.
- **Fine-Tuning Small-Scale Models**: Use the generated dataset to fine-tune a small language model with Reinforcement Learning from Human Feedback (RLHF), specifically the Direct Preference Optimization (DPO) technique.
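The dataset-generation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy functions `add` and `multiply`, the JSON call format, and the helper names `build_prompt`, `run_chain`, and `make_preference_pair` are all assumptions made for illustration. Canned call chains stand in for the completions that a large-scale LLM would produce, and the final result of each chain is checked to decide which completion is preferred.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical callable functions for a toy arithmetic task (not from the paper).
def add(a: float, b: float) -> float:
    return a + b

def multiply(a: float, b: float) -> float:
    return a * b

FUNCTIONS: Dict[str, Callable[..., float]] = {"add": add, "multiply": multiply}

def build_prompt(problem: str) -> str:
    """Inject a description and an example of the usable functions into the prompt."""
    described = "\n".join(f"- {name}(a, b)" for name in FUNCTIONS)
    example = '{"function": "add", "args": [1, 2]}'
    return (
        f"Problem: {problem}\n"
        f"Callable functions:\n{described}\n"
        f"Example call: {example}"
    )

def run_chain(calls: List[str]) -> List[float]:
    """Execute a chain of JSON-encoded function calls, one step at a time."""
    results = []
    for raw in calls:
        call = json.loads(raw)
        fn = FUNCTIONS[call["function"]]
        results.append(fn(*call["args"]))
    return results

def make_preference_pair(
    problem: str, chain_a: List[str], chain_b: List[str], expected: float
) -> Optional[dict]:
    """Label two candidate chains as chosen/rejected by checking the final result."""
    ok_a = run_chain(chain_a)[-1] == expected
    ok_b = run_chain(chain_b)[-1] == expected
    if ok_a == ok_b:
        return None  # both right or both wrong: no preference signal
    chosen, rejected = (chain_a, chain_b) if ok_a else (chain_b, chain_a)
    return {"prompt": build_prompt(problem), "chosen": chosen, "rejected": rejected}

# Canned chains standing in for LLM completions for "(2 + 3) * 4".
good = ['{"function": "add", "args": [2, 3]}', '{"function": "multiply", "args": [5, 4]}']
bad = ['{"function": "add", "args": [2, 3]}', '{"function": "multiply", "args": [5, 3]}']
pair = make_preference_pair("Compute (2 + 3) * 4", good, bad, expected=20)
```

Each resulting record has the prompt/chosen/rejected shape commonly used for preference-based fine-tuning. In a real agent loop, the model would see the result of each call before issuing the next one.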
With this method, the authors aim to balance the trade-off between model size and performance, reducing the demand for computational resources while preserving reasoning capability. The reported experimental results indicate that the framework improves the function-calling ability of small language models on the targeted reasoning tasks.
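For readers unfamiliar with DPO, the per-example loss used in the fine-tuning step can be sketched in plain Python. This follows the standard DPO objective, which is the negative log-sigmoid of the scaled difference between the policy-versus-reference log-probability ratios of the chosen and rejected completions. The function name `dpo_loss` and the default `beta=0.1` are assumptions for illustration. The summary above does not state the paper's hyperparameters or training code.

```python
import math

def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the paper
) -> float:
    """Per-example DPO loss from sequence log-probabilities."""
    # Implicit rewards: how much more likely each completion is under the
    # policy being trained than under the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

When the policy and reference agree on both completions, the margin is zero and the loss equals log 2. The loss falls below that value as the policy assigns relatively more probability to the chosen (correct) reasoning chain than to the rejected one.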