Abstract: The introduction of ChatGPT has led to a significant increase in the utilization of Large Language Models (LLMs) for addressing downstream tasks. There's an increasing focus on cost-efficient training and deployment within this context. Low-cost training and deployment of LLMs represent the future development trend. This paper reviews the evolution of large language model training techniques and inference deployment technologies aligned with this emerging trend. The discussion on training includes various aspects, including data preprocessing, training architecture, pre-training tasks, parallel training, and relevant content related to model fine-tuning. On the inference side, the paper covers topics such as model compression, parallel computation, memory scheduling, and structural optimization. It also explores LLMs' utilization and provides insights into their future development.

What problem does this paper attempt to address?

This paper surveys training and inference techniques for large language models (LLMs), along with their applications and likely future development. Since the arrival of tools like ChatGPT, the use of LLMs in natural language processing (NLP) has grown sharply, and low-cost training and deployment have become a research focus. On the training side, the paper reviews data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning. On the inference side, it covers deployment techniques such as model compression, parallel computing, memory scheduling, and structural optimization. It also discusses how LLMs are used and the prospects for future development.

The paper first introduces the importance of language modeling and its role in NLP tasks. It then traces the evolution from statistical language models to neural language models and on to pre-trained language models, with particular attention to the impact of the Transformer architecture and large-scale pre-trained models such as the GPT series. The paper points out that as model scale grows, LLMs exhibit "emergence": they generate high-quality text, show strong learning and reasoning abilities, and can even learn to perform tasks from a small number of examples.

The main goal of the paper is to give researchers a comprehensive overview of LLM training and inference techniques, so that they have the background needed to develop, deploy, and apply LLMs. The content includes the basics of LLMs, details of the Transformer architecture (such as the self-attention mechanism, encoder, and decoder), position embeddings, and the background and basic components of prompt learning. Finally, the paper discusses training strategies such as pre-training and fine-tuning as well as prompt-free fine-tuning, and lists commonly used datasets and preprocessing methods.

In summary, this paper aims to serve as a comprehensive guide for LLM researchers, helping them understand and optimize the training and inference of these models as the field of natural language processing continues to evolve.

Understanding LLMs: A Comprehensive Overview from Training to Inference

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Large Language Models (LLMs): Deployment, Tokenomics and Sustainability

Distributed Training of Large Language Models

Summary of ChatGPT-Related Research and Perspective Towards the Future of Large Language Models

Large Language Models Meet NLP: A Survey

Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models

LLMs are Also Effective Embedding Models: An In-depth Overview

Supervised Knowledge Makes Large Language Models Better In-context Learners

A Survey of Large Language Models

Large Language Models: A Survey

New Solutions on LLM Acceleration, Optimization, and Application

A Survey on Efficient Inference for Large Language Models

Challenges and Contributing Factors in the Utilization of Large Language Models (LLMs)

Exploring the Frontiers of LLMs in Psychological Applications: A Comprehensive Review

Large Language Models as Data Preprocessors

Unlocking the potential: A comprehensive exploration of large language models in natural language processing

DB-GPT: Large Language Model Meets Database

MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications

ChatGPT Alternative Solutions: Large Language Models Survey