Robert Spencer, Surangika Ranathunga, Mikael Boulic, Andries van Heerden, Teo Susnjak
Abstract: This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting. Transformers are a relatively new deep learning architecture, which has served as the foundation for groundbreaking technologies such as ChatGPT. While TL has been studied in the past, these studies considered either one TL strategy or used older deep learning models such as Recurrent Neural Networks or Convolutional Neural Networks. Here, we carry out an extensive empirical study on six different TL strategies and analyse their performance under varying feature spaces. In addition to the vanilla Transformer architecture, we also experiment with Informer and PatchTST, which are specifically designed for time series forecasting. We use 16 datasets from the Building Data Genome Project 2 to create building energy consumption forecasting models. Experimental results reveal that while TL is generally beneficial, especially when the target domain has no data, the exact TL strategy must be selected carefully to gain the maximum benefit. This decision largely depends on feature space properties such as the recorded weather features. We also note that PatchTST outperforms the other two Transformer variants (vanilla Transformer and Informer). We believe our findings will assist researchers in making informed decisions when using TL and Transformer architectures for building energy consumption forecasting.
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve
This paper aims to explore the effectiveness of Transfer Learning (TL) applied to transformer architectures for building energy consumption prediction. Specifically, the paper seeks to answer the following research questions:
1. **How do different data-driven transfer learning strategies compare in performance when applied to the base transformer architecture for building energy consumption prediction?**
- The researchers tested six different data-driven transfer learning strategies (1S → T, MS → T, 1S + T → T, MS + T → T, 1S + T → FT → T, and MS + T → FT → T) to evaluate their performance in different scenarios.
2. **How do specific characteristics of building energy consumption datasets (such as building type, climate zone, data volume, etc.) affect the effectiveness of different data-driven transfer learning strategies on the base transformer architecture?**
- By analyzing the characteristics of different datasets, such as climate zone, weather features, data volume, and time range, the researchers aim to understand the impact of these features on transfer learning strategies.
3. **How do different data-driven transfer learning strategies perform when applied to advanced transformer architectures specifically designed for time series prediction compared to the base transformer architecture?**
- In addition to the base transformer architecture, the researchers also tested two advanced transformer variants: Informer and PatchTST, to evaluate their performance in building energy consumption prediction.
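The six strategy names listed under the first research question can be read as small training recipes. The sketch below parses each name into a plan, under stated assumptions: `1S` is taken to mean one source dataset, `MS` multiple source datasets, `T` the target dataset, and `FT` a fine-tuning pass on the target before evaluation. This reading is inferred from the notation and is not confirmed by the summary. `TrainingPlan`, `build_plan`, and the dataset names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List

# The six strategy names exactly as listed in the summary.
STRATEGIES = [
    "1S → T",
    "MS → T",
    "1S + T → T",
    "MS + T → T",
    "1S + T → FT → T",
    "MS + T → FT → T",
]


@dataclass
class TrainingPlan:
    train_on: List[str]      # datasets used in the initial training pass
    fine_tune_on: List[str]  # datasets used in the optional fine-tuning pass
    evaluate_on: str         # dataset the final model is evaluated on


def build_plan(strategy: str, sources: List[str], target: str) -> TrainingPlan:
    """Turn a strategy name into a training plan (assumed interpretation)."""
    stages = [stage.strip() for stage in strategy.split("→")]
    train_on: List[str] = []
    for token in (t.strip() for t in stages[0].split("+")):
        if token == "1S":
            # Assumption: pick the first source; the summary does not say
            # how the single source dataset is chosen.
            train_on.append(sources[0])
        elif token == "MS":
            train_on.extend(sources)
        elif token == "T":
            train_on.append(target)
        else:
            raise ValueError(f"unknown token in strategy: {token!r}")
    # Assumption: "FT" between training and evaluation means fine-tuning
    # on the target dataset.
    fine_tune_on = [target] if "FT" in stages[1:-1] else []
    return TrainingPlan(train_on, fine_tune_on, evaluate_on=target)


if __name__ == "__main__":
    for name in STRATEGIES:
        print(name, build_plan(name, ["source_a", "source_b"], "target_x"))
```

Under this reading, the two strategies without `T` on the left-hand side (`1S → T` and `MS → T`) never see target data during training, which matches the abstract's case of a target domain with no data.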
### Background and Motivation
- **Background**: Globally, reducing carbon emissions in the building sector has become a crucial measure to combat climate change. Building energy consumption accounts for about one-third of global greenhouse gas emissions, making accurate prediction of building energy consumption essential for achieving economic feasibility, environmental sustainability, and occupant comfort.
- **Existing Methods**: Traditional methods for predicting building energy consumption include engineering calculations, numerical simulations, and data-driven modeling. Data-driven modeling has gained popularity in recent years because it captures dynamic and complex energy consumption patterns accurately, especially when deep learning techniques (such as RNN, LSTM, and CNN) are applied.
- **Problem**: Despite the excellent performance of deep learning techniques in building energy consumption prediction, they typically require large amounts of training data. However, for some new buildings or buildings with insufficient data records, obtaining large amounts of data is unrealistic. Transfer learning can leverage existing large datasets to improve the prediction accuracy for target buildings.
### Research Contributions
- **Comprehensive Analysis of Transfer Learning Strategies**: The researchers conducted large-scale experiments covering 16 datasets from the Building Data Genome Project 2, testing six data-driven transfer learning strategies, making it one of the most comprehensive studies to date.
- **Impact Analysis of Dataset Characteristics**: The study analyzed the impact of factors such as climate zone, weather features, data volume, and time range on transfer learning strategies.
- **Application of Advanced Transformer Variants**: In addition to the base transformer architecture, the study included experiments with two advanced transformer variants, Informer and PatchTST, providing a more comprehensive comparison.
Together, these experiments give researchers and practitioners a reference for choosing transfer learning strategies and transformer architectures for building energy consumption prediction.