Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases

Xiang Zhang, Khatoon Khedri, Reza Rawassizadeh
2024-04-13
Abstract: Large Language Models (LLMs) can automate or substitute different types of tasks in the software engineering process. This study evaluates the resource utilization and accuracy of LLM in interpreting and executing natural language queries against traditional SQL within relational database management systems. We empirically examine the resource utilization and accuracy of nine LLMs varying from 7 to 34 Billion parameters, including Llama2 7B, Llama2 13B, Mistral, Mixtral, Optimus-7B, SUS-chat-34B, platypus-yi-34b, NeuralHermes-2.5-Mistral-7B and Starling-LM-7B-alpha, using a small transaction dataset. Our findings indicate that using LLMs for database queries incurs significant energy overhead (even small and quantized models), making it an environmentally unfriendly approach. Therefore, we advise against replacing relational databases with LLMs due to their substantial resource utilization.
Databases, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper investigates whether large language models (LLMs) can replace SQL queries against traditional relational database management systems (RDBMS) when answering natural language queries. The study compares nine LLMs of varying size, ranging from 7 billion to 34 billion parameters, with traditional SQL in terms of resource utilization and accuracy, using a small transaction dataset. The results indicate that using LLMs for database queries incurs significant energy overhead, even for small and quantized models, making this approach environmentally unfriendly. The paper also points out that LLMs are less accurate and suffer from factual errors and hallucinations when processing natural language queries. Although other efforts aim to improve the factual correctness and coverage of LLMs and to address token size limitations, this research focuses on benchmarking resource utilization. The paper emphasizes that while larger models may improve accuracy in the future, the energy issue persists. Additionally, the paper reviews relevant literature and discusses the energy and water consumption of AI models. In conclusion, the paper asks whether LLMs are a feasible alternative to SQL for database queries in terms of resource utilization and accuracy, and finds that, because of their low energy efficiency and inadequate accuracy, current LLMs are not suitable replacements for traditional SQL queries.
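To make the comparison concrete, the following is a minimal sketch of what the SQL side of such a benchmark might look like. It is not the authors' setup: the `transactions` schema, the toy rows, the `total_spent` helper, and the use of SQLite are all assumptions made for illustration, and it records only wall-clock time, not energy, which the paper measures.

```python
import sqlite3
import time

# Hypothetical toy data; the paper's actual dataset and schema are not shown here.
ROWS = [
    (1, "alice", 20.0),
    (2, "bob", 35.5),
    (3, "alice", 12.5),
    (4, "carol", 7.0),
]


def build_db():
    """Create an in-memory SQLite database holding the toy transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", ROWS)
    conn.commit()
    return conn


def total_spent(conn, customer):
    """SQL counterpart of the question 'How much did <customer> spend in total?'.

    Returns the total and the elapsed wall-clock time of the query in seconds.
    """
    start = time.perf_counter()
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM transactions WHERE customer = ?",
        (customer,),
    ).fetchone()
    elapsed = time.perf_counter() - start
    return total, elapsed


conn = build_db()
total, elapsed = total_spent(conn, "alice")
print(f"total={total}, query time={elapsed:.6f}s")
```

In an LLM-based variant, the same question would be sent as a natural language prompt together with the table contents, and the answer would have to be checked against the value the SQL query returns.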