Green Runner: A tool for efficient model selection from model repositories

Jai Kannan, Scott Barnett, Anj Simmons, Taylan Selvi, Luis Cruz
DOI: https://doi.org/10.48550/arXiv.2305.16849
2023-05-26
Abstract: Deep learning models have become essential in software engineering, enabling intelligent features like image captioning and document generation. However, their popularity raises concerns about environmental impact and inefficient model selection. This paper introduces GreenRunnerGPT, a novel tool for efficiently selecting deep learning models based on specific use cases. It employs a large language model to suggest weights for quality indicators, optimizing resource utilization. The tool utilizes a multi-armed bandit framework to evaluate models against target datasets, considering tradeoffs. We demonstrate that GreenRunnerGPT is able to identify a model suited to a target use case without wasteful computations that would occur under a brute-force approach to model selection.
Software Engineering, Machine Learning
### What problems does this paper attempt to solve?

This paper addresses two main problems in deep learning model selection:

1. **Environmental impact and resource waste**:
   - The widespread use of deep learning models carries a significant environmental burden: their running costs account for 80% to 90% of total model costs, and inefficient use of computing resources increases the carbon footprint.
   - Current model selection methods (such as benchmarking and brute-force search) either select suboptimal models or waste large amounts of computing resources.
2. **Complexity and trade-offs in model selection**:
   - Model selection is not just about choosing the highest-performing model; it also requires trade-offs that depend on the application scenario. For example, models running on drones must respect hardware limitations, while models running in self-driving cars need low latency to ensure safety.
   - Existing model selection methods usually focus only on identifying the highest-performing model and ignore the trade-offs among quality metrics (such as accuracy, model size, and complexity) during selection.

To solve these problems, the authors propose a tool named **GreenRunnerGPT**, which improves the model selection process in the following ways:

- **Efficient model selection**: Using a multi-armed bandit framework, GreenRunnerGPT evaluates multiple models efficiently and finds the model that best suits the target use case within a limited budget.
- **Trade-offs between quality metrics**: A large language model (LLM) suggests weights for the different quality metrics, which optimizes resource utilization and balances the trade-offs.
- **Reduced waste of computing resources**: Compared with brute-force search, GreenRunnerGPT significantly reduces the consumption of computing resources and therefore carbon emissions.
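The budgeted selection described above can be illustrated with a small bandit loop. This is a minimal sketch, not the tool's implementation: the summary does not say which bandit policy GreenRunnerGPT uses, so an epsilon-greedy policy is assumed here, and `select_model`, `simulated_batch_reward`, the model names, and the reward values are all hypothetical.

```python
import random


def select_model(evaluate_batch, model_names, budget, epsilon=0.1, seed=0):
    """Pick a model with an epsilon-greedy bandit (assumed policy).

    evaluate_batch(name, rng) returns the reward observed when the model
    `name` is evaluated on one batch of the target dataset. `budget` is
    the total number of batch evaluations allowed.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in model_names}
    mean_reward = {name: 0.0 for name in model_names}

    for step in range(budget):
        if step < len(model_names):
            name = model_names[step]           # try every model once first
        elif rng.random() < epsilon:
            name = rng.choice(model_names)     # explore a random model
        else:
            name = max(model_names, key=mean_reward.get)  # exploit the best so far

        observed = evaluate_batch(name, rng)
        counts[name] += 1
        # Incremental update of the running mean reward for this model.
        mean_reward[name] += (observed - mean_reward[name]) / counts[name]

    best = max(model_names, key=mean_reward.get)
    return best, counts


# Hypothetical candidates: the "true" rewards are invented for the demo.
TRUE_REWARD = {"model_a": 0.55, "model_b": 0.80, "model_c": 0.65}


def simulated_batch_reward(name, rng):
    # Stand-in for a real batch evaluation: true reward plus Gaussian noise.
    return TRUE_REWARD[name] + rng.gauss(0.0, 0.05)


best, counts = select_model(simulated_batch_reward, list(TRUE_REWARD), budget=60)
print("selected:", best)
print("evaluations per model:", counts)
```

A brute-force approach would evaluate every model on the full dataset. Here the evaluation budget is fixed up front and most of it goes to the models that look promising, which is the saving the summary attributes to the bandit framework.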
In short, GreenRunnerGPT aims to improve the efficiency of model selection and reduce its environmental impact through an intelligent model selection method.

### Formula representation

The formula for the reward function is:

\[
\begin{aligned}
\text{reward} ={} & \text{accuracy} \times \text{weight}_{\text{acc}} \\
& - \left( \frac{\log(\text{size}) - \log(\text{min size})}{\log(\text{max size}) - \log(\text{min size})} \right) \times \text{weight}_{\text{size}} \\
& - \left( \frac{\log(\text{complexity}) - \log(\text{min complexity})}{\log(\text{max complexity}) - \log(\text{min complexity})} \right) \times \text{weight}_{\text{complexity}}
\end{aligned}
\]

where:

- \(\text{accuracy}\) is the accuracy of the model.
- \(\text{size}\) is the size of the model.
- \(\text{complexity}\) is the complexity of the model.
- \(\text{weight}_{\text{acc}}\), \(\text{weight}_{\text{size}}\) and \(\text{weight}_{\text{complexity}}\) are the weights of accuracy, size and complexity respectively.

In this way, GreenRunnerGPT can adjust the weights of the quality metrics to the specific usage scenario and select suitable models more efficiently.
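The reward function can be transcribed into a few lines of Python to show how the weights shift the ranking. This is a sketch under stated assumptions: the summary does not define the minimum and maximum values, so they are taken here as the bounds over the candidate pool. The names `log_normalize` and `reward`, the candidate models, their units, and the weights are all hypothetical.

```python
import math


def log_normalize(value, min_value, max_value):
    """Scale value to [0, 1] on a log scale between min_value and max_value."""
    if max_value == min_value:
        return 0.0  # assumption: no penalty when all candidates are equal
    return (math.log(value) - math.log(min_value)) / (
        math.log(max_value) - math.log(min_value)
    )


def reward(accuracy, size, complexity, weights, size_bounds, complexity_bounds):
    """Weighted accuracy minus log-normalized size and complexity penalties."""
    size_penalty = log_normalize(size, *size_bounds) * weights["size"]
    complexity_penalty = (
        log_normalize(complexity, *complexity_bounds) * weights["complexity"]
    )
    return accuracy * weights["acc"] - size_penalty - complexity_penalty


# Hypothetical candidates: (accuracy, size in MB, complexity in GFLOPs).
candidates = {
    "small": (0.75, 10.0, 1.0),
    "medium": (0.85, 100.0, 10.0),
    "large": (0.90, 1000.0, 100.0),
}

# Hypothetical weights, standing in for the values an LLM would suggest.
weights = {"acc": 1.0, "size": 0.5, "complexity": 0.25}

# Assumption: min and max are taken over the candidate pool.
sizes = [size for _, size, _ in candidates.values()]
complexities = [cx for _, _, cx in candidates.values()]
size_bounds = (min(sizes), max(sizes))
complexity_bounds = (min(complexities), max(complexities))

for name, (accuracy, size, complexity) in candidates.items():
    score = reward(accuracy, size, complexity, weights, size_bounds, complexity_bounds)
    print(f"{name}: {score:.3f}")
```

Each penalty term is a ratio of log differences, so the base of the logarithm cancels out and does not affect the result. With these weights the smallest model pays no penalty and the largest pays the full `weight_size + weight_complexity`. Lowering those two weights moves the ranking back toward raw accuracy.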