Data-Driven Building Load Prediction and Large Language Models: Comprehensive Overview

Yake Zhang, Dijun Wang, Guansong Wang, Peng Xu, Yihao Zhu
DOI: https://doi.org/10.1016/j.enbuild.2024.115001
IF: 7.201
2024-01-01
Energy and Buildings
Abstract: Building load forecasting is essential for optimizing architectural design, managing energy efficiently, improving the performance of Heating, Ventilation, and Air Conditioning (HVAC) systems, and enhancing occupant comfort. With advances in data science and machine learning, data-driven building load prediction has attracted rapidly growing research attention. However, previous studies have typically faced challenges such as data scarcity, improper feature extraction methods, and weak model generalization. To gain a deeper understanding of these issues, this review examines the data processing, feature selection, and model selection methods used in previous research from the perspective of the entire load forecasting process. The aim is to identify the most suitable methods for each step of load forecasting to enhance prediction accuracy. The review surveys the research progress of statistical learning methods, traditional machine learning methods, deep learning methods, and hybrid methods in different application scenarios of building load prediction. It then emphasizes the critical role of data preprocessing and focuses on techniques such as data fusion and transfer learning to overcome data shortages and strengthen the models’ ability to generalize. Moreover, it explores the extraction of significant features from building characteristics, weather data, and operational statistics to boost prediction accuracy. A notable contribution of this review is a proposed technical framework for EnergyPlus model generation using LLM-based Retrieval-Augmented Generation (RAG) technology and room-level load prediction with Spatio-Temporal Graph Neural Networks. This framework utilizes architectural design drawings to achieve an “end-to-end” prediction process, aiming to lower the expertise barrier for load prediction and provide technical support for fine-grained regulation of building operation.
An exploratory experiment is conducted using a single-zone building model to verify the feasibility of LLM-generated EnergyPlus models, with IDF simulation file generation taking only 196 s. Room-level load forecasting with LLMs remains to be explored further. It is reasonable to believe that the methods proposed in this review hold promise for advancing data-driven building load forecasting technologies.