Abstract: Short-term forecasting of residential and commercial building energy consumption is widely used in power systems and continues to grow in importance. Data-driven short-term load forecasting (STLF), although promising, has suffered from a lack of open, large-scale datasets with high building diversity. This has hindered exploring the pretrain-then-fine-tune paradigm for STLF. To help address this, we present BuildingsBench, which consists of: 1) Buildings-900K, a large-scale dataset of 900K simulated buildings representing the U.S. building stock; and 2) an evaluation platform with over 1,900 real residential and commercial buildings from 7 open datasets. BuildingsBench benchmarks two under-explored tasks: zero-shot STLF, where a pretrained model is evaluated on unseen buildings without fine-tuning, and transfer learning, where a pretrained model is fine-tuned on a target building. The main finding of our benchmark analysis is that synthetically pretrained models generalize surprisingly well to real commercial buildings. An exploration of the effect of increasing dataset size and diversity on zero-shot commercial building performance reveals a power-law with diminishing returns. We also show that fine-tuning pretrained models on real commercial and residential buildings improves performance for a majority of target buildings. We hope that BuildingsBench encourages and facilitates future research on generalizable STLF. All datasets and code can be accessed from https://github.com/NREL/BuildingsBench.

What problem does this paper attempt to address?

This paper addresses the lack of large-scale, diverse public datasets for short-term load forecasting (STLF) of building energy consumption. It introduces **BuildingsBench**, a large-scale dataset and benchmarking platform with two main parts:

1. **Buildings-900K**: A time-series dataset of 900,000 simulated buildings representing the building stock of the United States.
2. **Evaluation platform**: Data from more than 1,900 real residential and commercial buildings, drawn from 7 public datasets, used to evaluate model performance.

### Main research questions

1. **Zero-shot short-term load forecasting (zero-shot STLF)**:
   - A pretrained model forecasts load on unseen buildings without any fine-tuning.
   - Studies how well synthetically pretrained models generalize to real commercial buildings.
2. **Transfer learning**:
   - A pretrained model is fine-tuned on the target building, assuming only a limited amount of data (6 months) is available.
   - Studies the effect of fine-tuning on model performance.

### Research background

- **Importance of short-term load forecasting**: STLF plays an important role in the power system. It helps match energy supply with demand, optimize energy market pricing, and manage building energy through reinforcement learning and model predictive control.
- **Existing challenges**: Data-driven STLF is promising, but the lack of large-scale, diverse public datasets has hindered research on the pretrain-then-fine-tune paradigm.

### Dataset characteristics

- **Buildings-900K**: Time-series data for 900,000 simulated buildings, covering residential and commercial buildings across different climate regions of the United States.
- **Evaluation platform**: Combines several real building datasets that span diverse geographic locations, years, and building types.

### Main findings

- **Generalization of synthetically pretrained models**: Models pretrained on synthetic data perform surprisingly well on real commercial buildings in the zero-shot forecasting task.
- **Impact of dataset size and diversity**: As pretraining dataset size and diversity increase, zero-shot performance on commercial buildings follows a power law with diminishing returns.
- **Effect of fine-tuning**: Fine-tuning pretrained models on real commercial and residential buildings improves performance for a majority of target buildings.

### Conclusions

- **Contributions**:
  1. **Buildings-900K**: A simulated dataset for large-scale pretraining.
  2. **Evaluation platform**: A benchmarking platform for zero-shot STLF and transfer learning.
  3. **Insights into model pretraining**: An investigation of how pretrained models apply to short-term load forecasting.

### Formula representation

- **Normalized Root Mean Square Error (NRMSE)**:
  \[
  \text{NRMSE}=100\times\frac{1}{\bar{y}}\sqrt{\frac{1}{24M}\sum_{j = 1}^{M}\sum_{i = 1}^{24}(y_{i,j}-\hat{y}_{i,j})^2}
  \]
  where \(\hat{y}_{i,j}\) is the predicted load, \(y_{i,j}\) is the actual load, \(i\) indexes the 24 values within day \(j\), and \(\bar{y}\) is the average of the actual loads over all \(M\) days.
- **Ranked Probability Score (RPS)**:
  \[
  \text{RPS}=\int_{0}^{\infty}(\hat{F}_i(y)-1_{y_i\leq y})^2\,dy
  \]
  where \(\hat{F}_i(y)\) is the predicted cumulative distribution function and \(1_{y_i\leq y}\) is an indicator function that equals 1 when the actual load \(y_i\) is less than or equal to \(y\) and 0 otherwise.

Through this work, the paper hopes to encourage and facilitate future research on generalizable short-term load forecasting.
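The two metrics defined above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the BuildingsBench implementation: the function names `nrmse` and `rps`, the `(M, 24)` array layout, and the finite upper integration limit `y_max` (standing in for the infinite upper bound of the RPS integral) are all assumptions made for this example.

```python
import numpy as np


def nrmse(y_true, y_pred):
    """NRMSE in percent for loads arranged as (M days, 24 values per day).

    Hypothetical helper: the (M, 24) layout is an assumption of this sketch.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape or y_true.ndim != 2 or y_true.shape[1] != 24:
        raise ValueError("expected two arrays of identical shape (M, 24)")
    # Mean over all 24*M entries matches the 1/(24M) double sum.
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    # y_bar is the average actual load over all M days.
    return float(100.0 * rmse / y_true.mean())


def rps(cdf, y_obs, y_max, n_grid=10001):
    """Approximate the RPS integral for one observation with the trapezoid rule.

    `cdf` is a callable returning the predicted CDF on an array of loads.
    The integral is truncated at `y_max` (an assumption of this sketch),
    which should lie well above both the observation and the forecast mass.
    """
    if y_max <= y_obs:
        raise ValueError("y_max must exceed the observed load")
    grid = np.linspace(0.0, y_max, n_grid)
    indicator = (grid >= y_obs).astype(float)  # 1 when y_obs <= y
    integrand = (cdf(grid) - indicator) ** 2
    dy = grid[1] - grid[0]
    # Trapezoid rule written out by hand to avoid version-specific NumPy names.
    return float(np.sum(0.5 * (integrand[:-1] + integrand[1:])) * dy)
```

As a sanity check, a forecast that is off by a constant 1 unit against a constant actual load of 10 gives an NRMSE of 10 percent. For a deterministic forecast, whose CDF is a step function, the RPS reduces to the absolute error between the forecast and the observation, so a perfect point forecast scores 0.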

BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting