Abstract: This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting. Transformers are a relatively new deep learning architecture that has served as the foundation for groundbreaking technologies such as ChatGPT. While TL has been studied in the past, these studies either considered only one TL strategy or used older deep learning models such as Recurrent Neural Networks or Convolutional Neural Networks. Here, we carry out an extensive empirical study on six different TL strategies and analyse their performance under varying feature spaces. In addition to the vanilla Transformer architecture, we also experiment with Informer and PatchTST, which are specifically designed for time series forecasting. We use 16 datasets from the Building Data Genome Project 2 to create building energy consumption forecasting models. Experimental results reveal that while TL is generally beneficial, especially when the target domain has no data, the exact TL strategy must be selected carefully to gain the maximum benefit. This decision largely depends on feature space properties such as the recorded weather features. We also note that PatchTST outperforms the other two Transformer variants (vanilla Transformer and Informer). We believe our findings will assist researchers in making informed decisions when using TL and Transformer architectures for building energy consumption forecasting.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to explore the effectiveness of Transfer Learning (TL) applied to transformer architectures for building energy consumption prediction. Specifically, the paper seeks to answer the following research questions:

1. **How do different data-driven transfer learning strategies compare in performance when applied to the base transformer architecture for building energy consumption prediction?**
   - The researchers tested six different data-driven transfer learning strategies (1S → T, MS → T, 1S + T → T, MS + T → T, 1S + T → FT → T, and MS + T → FT → T) to evaluate their performance in different scenarios.
2. **How do specific characteristics of building energy consumption datasets (such as building type, climate zone, data volume, etc.) affect the effectiveness of different data-driven transfer learning strategies on the base transformer architecture?**
   - By analyzing the characteristics of different datasets, such as climate zone, weather features, data volume, and time range, the researchers aim to understand the impact of these features on transfer learning strategies.
3. **How do different data-driven transfer learning strategies perform when applied to advanced transformer architectures specifically designed for time series prediction compared to the base transformer architecture?**
   - In addition to the base transformer architecture, the researchers also tested two advanced transformer variants, Informer and PatchTST, to evaluate their performance in building energy consumption prediction.

### Background and Motivation

- **Background**: Globally, reducing carbon emissions in the building sector has become a crucial measure to combat climate change. Building energy consumption accounts for about one-third of global greenhouse gas emissions, making accurate prediction of building energy consumption essential for achieving economic feasibility, environmental sustainability, and occupant comfort.
- **Existing Methods**: Traditional methods for predicting building energy consumption include engineering calculations, numerical simulations, and data-driven modeling. Data-driven modeling has gained popularity in recent years due to its high accuracy in dynamic and complex energy consumption patterns, especially with the application of deep learning techniques (such as RNN, LSTM, and CNN).
- **Problem**: Despite the excellent performance of deep learning techniques in building energy consumption prediction, they typically require large amounts of training data. However, for some new buildings or buildings with insufficient data records, obtaining large amounts of data is unrealistic. Transfer learning can leverage existing large datasets to improve the prediction accuracy for target buildings.

### Research Contributions

- **Comprehensive Analysis of Transfer Learning Strategies**: The researchers conducted large-scale experiments covering 16 datasets from the Building Data Genome Project 2, testing six data-driven transfer learning strategies, making it one of the most comprehensive studies to date.
- **Impact Analysis of Dataset Characteristics**: The study analyzed the impact of factors such as climate zone, weather features, data volume, and time range on transfer learning strategies.
- **Application of Advanced Transformer Variants**: In addition to the base transformer architecture, the study included experiments with two advanced transformer variants, Informer and PatchTST, providing a more comprehensive comparison.

Through these studies, the paper provides valuable references for researchers and practitioners in using transfer learning and transformer architectures for building energy consumption prediction.
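The six strategy labels listed under the first research question are not defined in this summary. As a minimal sketch, the Python below decodes them under one plausible reading, which is an assumption rather than the paper's stated definition: `1S` and `MS` stand for one or several source datasets, `+ T` means the target's training data is included in training, `FT` is a fine-tuning stage on the target, and the final `T` is evaluation on the target. The names `Strategy`, `parse_strategy`, and `describe` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List

# The six labels exactly as they appear in the summary.
STRATEGY_LABELS = [
    "1S → T",
    "MS → T",
    "1S + T → T",
    "MS + T → T",
    "1S + T → FT → T",
    "MS + T → FT → T",
]


@dataclass(frozen=True)
class Strategy:
    label: str
    multi_source: bool        # assumed: True for "MS", False for "1S"
    target_in_training: bool  # assumed: "+ T" on the training side
    fine_tune: bool           # assumed: an "FT" stage before evaluation


def parse_strategy(label: str) -> Strategy:
    """Decode a label such as '1S + T → FT → T' under the assumed reading."""
    stages = [s.strip() for s in label.split("→")]
    if len(stages) < 2 or stages[-1] != "T":
        raise ValueError(f"expected a label ending in '→ T', got {label!r}")
    train_parts = [p.strip() for p in stages[0].split("+")]
    # Exactly one of "1S" / "MS" must name the source side.
    if ("1S" in train_parts) == ("MS" in train_parts):
        raise ValueError(f"expected exactly one of '1S' or 'MS' in {label!r}")
    return Strategy(
        label=label,
        multi_source="MS" in train_parts,
        target_in_training="T" in train_parts,
        fine_tune="FT" in stages[1:-1],
    )


def describe(strategy: Strategy) -> List[str]:
    """Return the ordered steps implied by a strategy (illustrative wording)."""
    sources = "several source datasets" if strategy.multi_source else "one source dataset"
    train = f"train on {sources}"
    if strategy.target_in_training:
        train += " plus the target's training split"
    steps = [train]
    if strategy.fine_tune:
        steps.append("fine-tune on the target's training split")
    steps.append("evaluate on the target's held-out split")
    return steps


for label in STRATEGY_LABELS:
    print(f"{label}: " + "; ".join(describe(parse_strategy(label))))
```

Read this way, the first two labels correspond to the case where the target contributes no training data, which matches the abstract's remark that TL helps most when the target domain has no data.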

Transfer Learning on Transformers for Building Energy Consumption Forecasting -- A Comparative Study

TFEformer: Temporal Feature Enhanced Transformer for Multivariate Time Series Forecasting

Hidformer: Hierarchical Dual-Tower Transformer Using Multi-Scale Mergence for Long-Term Time Series Forecasting

Small Sample Building Energy Consumption Prediction Using Contrastive Transformer Networks

Transformers for Energy Forecast

Transfer Learning in Transformer-Based Demand Forecasting For Home Energy Management System

Transfer Learning in Deep Learning Models for Building Load Forecasting: Case of Limited Data

Transformer-Based Model for Electrical Load Forecasting

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

Forecasting Energy Consumption of a Public Building Using Transformer and Support Vector Regression

Multi-Task Learning and Temporal-Fusion-Transformer-Based Forecasting of Building Power Consumption

A novel Transformer-based network forecasting method for building cooling loads

A Systematic Review for Transformer-based Long-term Series Forecasting

Transformer Training Strategies for Forecasting Multiple Load Time Series

Residential energy consumption forecasting using deep learning models

Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers

Implementation of a Long Short-Term Memory Transfer Learning (LSTM-TL)-Based Data-Driven Model for Building Energy Demand Forecasting

Transfer Learning in the Transformer Model for Thermal Comfort Prediction: A Case of Limited Data

Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning

sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting

Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting