Integrating Active Learning and Semi-Supervised Learning for Improved Data-Driven HVAC Fault Diagnosis Performance

Cheng Fan, Qiuting Wu, Yang Zhao, Like Mo
DOI: https://doi.org/10.1016/j.apenergy.2023.122356
IF: 11.2
2024-01-01
Applied Energy
Abstract: Data-driven methods have drawn increasing interest in HVAC fault diagnosis tasks due to their intrinsic advantages in making real-time automated decisions. To ensure the reliability of data-driven models, it is essential to prepare sufficient labeled data for predictive modeling. In practice, it can be very time-consuming and labor-intensive to determine the actual operating condition or label of each data sample (e.g., Normal or Faulty), making it highly challenging to develop robust data-driven solutions through conventional supervised learning methods. To tackle such challenges, this study proposes a data analytic framework that integrates active learning and semi-supervised learning to utilize massive unlabeled data for improved fault diagnosis performance. More specifically, five active learning methods have been tested to quantify their effectiveness in discovering valuable unlabeled data for expert labeling. Semi-supervised data-driven models have been developed to enable autonomous knowledge discovery from unlabeled building operational data through self-training protocols. Data experiments have been conducted to explore the separate and integrated value of active and semi-supervised learning. The results show that active learning can effectively identify valuable data samples for fault diagnosis, thereby reducing labeling costs by approximately 50%. Cost-effective combinatorial strategies have been derived to integrate active learning and semi-supervised learning for practical applications. The research outcomes are valuable for developing advanced data-driven solutions with substantial decreases in manual costs.
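As a rough illustration of how the two ideas in the abstract can fit together, the sketch below splits a pool of unlabeled samples into those sent to an expert (active learning) and those pseudo-labeled automatically (self-training). It is not the authors' implementation: the abstract does not name the five active learning methods tested, so entropy-based uncertainty sampling is assumed here, and the function name `split_unlabeled`, the query budget, the confidence threshold, and the example probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (higher means more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def split_unlabeled(probabilities, query_budget, confidence_threshold):
    """Split unlabeled samples into expert queries and pseudo-labeled samples.

    probabilities: dict mapping a sample id to the class probabilities that
    some already-trained classifier predicted for it (assumed to be given).
    """
    # Active learning step (assumed strategy: entropy sampling).
    # The most uncertain samples are sent to the expert for labeling.
    ranked = sorted(probabilities, key=lambda i: entropy(probabilities[i]), reverse=True)
    to_query = ranked[:query_budget]

    # Self-training step: among the remaining samples, keep the predicted
    # class as a pseudo-label only when the model is confident enough.
    pseudo_labels = {}
    for i in ranked[query_budget:]:
        probs = probabilities[i]
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= confidence_threshold:
            pseudo_labels[i] = best
    return to_query, pseudo_labels


# Hypothetical predictions for four unlabeled samples; class 0 = Normal, class 1 = Faulty.
probabilities = {
    "s1": [0.98, 0.02],
    "s2": [0.55, 0.45],
    "s3": [0.10, 0.90],
    "s4": [0.70, 0.30],
}
to_query, pseudo_labels = split_unlabeled(
    probabilities, query_budget=1, confidence_threshold=0.9
)
print("ask expert:", to_query)
print("pseudo-labels:", pseudo_labels)
```

In a full loop, the expert-labeled and pseudo-labeled samples would be added to the training set and the classifier retrained before the next round. Samples that are neither queried nor confident enough (such as `s4` here) simply stay in the unlabeled pool.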
What problem does this paper attempt to address?