The Efficiency Spectrum of Large Language Models: An Algorithmic Survey

Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang

2024-04-19

Abstract: The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains, reshaping the artificial general intelligence landscape. However, the increasing computational and memory demands of these models present substantial challenges, hindering both academic research and practical applications. To address these issues, a wide array of methods, including both algorithmic and hardware solutions, have been developed to enhance the efficiency of LLMs. This survey delivers a comprehensive review of algorithmic advancements aimed at improving LLM efficiency. Unlike other surveys that typically focus on specific areas such as training or model compression, this paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs. Specifically, it covers various topics related to efficiency, including scaling laws, data utilization, architectural innovations, training and tuning strategies, and inference techniques. This paper aims to serve as a valuable resource for researchers and practitioners, laying the groundwork for future innovations in this critical research area. Our repository of relevant references is maintained at https://github.com/tding1/Efficient-LLM-Survey.

Computation and Language

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper "The Efficiency Spectrum of Large Language Models: An Algorithmic Survey" addresses the substantial computational and memory demands of large language models (LLMs). As LLMs grow rapidly in scale, they play increasingly important roles in many fields, but their high computational cost and memory requirements limit their use in academic research and practical applications. To tackle these issues, the paper provides a comprehensive review of algorithmic approaches to improving LLM efficiency. Specifically, the paper covers the following aspects:

1. **Data Utilization**: Exploring how to optimize data utilization to reduce resource consumption without affecting model performance.
2. **Architecture Design**: Reviewing innovative architecture designs and analyzing how architecture impacts efficiency.
3. **Training and Tuning Strategies**: Discussing strategies for efficiently training LLMs from scratch and fine-tuning pre-trained models for specific downstream tasks.
4. **Inference Techniques**: Exploring model compression techniques to accelerate inference and reduce memory usage.

Through this multi-dimensional analysis, the paper aims to provide researchers and practitioners with a comprehensive resource, laying the foundation for future innovations in this critical research area.

Efficient Large Language Models: A Survey

Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

A Survey on Efficient Inference for Large Language Models

Efficient Multimodal Large Language Models: A Survey

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

Efficient Training of Large Language Models on Distributed Infrastructures: A Survey

A Survey on Evaluation of Large Language Models

Model Compression and Efficient Inference for Large Language Models: A Survey

A Survey on Model Compression for Large Language Models

A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models

Search for Efficient Large Language Models

A Survey on Hardware Accelerators for Large Language Models

A Survey of Large Language Models

A Systematic Survey on Large Language Models for Algorithm Design

A Survey of Resource-efficient LLM and Multimodal Foundation Models

A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery