On the application of transfer learning in prognostics and health management

Ramin Moradi, Katrina M. Groth
DOI: https://doi.org/10.48550/arXiv.2007.01965
2020-07-04
Abstract: Advancements in sensing and computing technologies, the development of human and computer interaction frameworks, big data storage capabilities, and the emergence of cloud storage and cloud computing have resulted in an abundance of data in modern industry. This data availability has encouraged researchers and industry practitioners to rely on data-based machine learning, especially deep learning, models for fault diagnostics and prognostics more than ever. These models provide unique advantages; however, their performance is heavily dependent on the training data and how well that data represents the test data. This issue mandates fine-tuning and even training the models from scratch when there is a slight change in operating conditions or equipment. Transfer learning is an approach that can remedy this issue by keeping portions of what is learned from previous training and transferring them to the new application. In this paper, a unified definition for transfer learning and its different types is provided, Prognostics and Health Management (PHM) studies that have used transfer learning are reviewed in detail, and finally, a discussion on transfer learning application considerations and gaps is provided for improving the applicability of transfer learning in PHM.
Machine Learning
### What problems does this paper attempt to solve?

This paper explores the application of transfer learning (TL) in prognostics and health management (PHM). Specifically, it addresses the following key problems:

1. **Insufficient model generalization ability**:
   - In PHM, data-based machine learning models (especially deep learning models) generalize poorly across different equipment, settings, and operating conditions. Even a slight change in operating conditions or equipment can require the model to be fine-tuned or retrained.
   - **Formula representation**: Suppose we have a source domain \(D_S\) and a target domain \(D_T\), with corresponding label spaces \(Y_S\) and \(Y_T\), where \(D_S = \{X_S, P(X_S)\}\) and \(D_T = \{X_T, P(X_T)\}\). If the data distributions of the two domains differ, that is, \(P(X_S) \neq P(X_T)\), a model trained on the source domain may perform significantly worse on the target domain.

2. **Scarcity of high-quality labeled data**:
   - In PHM, high-quality labeled data containing fault information is very difficult to obtain, especially for expensive and safety-critical systems. Many machines and systems cannot be run to failure because doing so would incur high costs or serious consequences.
   - **Formula representation**: Let \(T_S = \{Y_S, f_S(x)\}\) and \(T_T = \{Y_T, f_T(x)\}\) be the learning tasks of the source domain and the target domain respectively, where \(f_S(x)\) and \(f_T(x)\) are the prediction functions. If the target domain lacks sufficient labeled data, it is very difficult to learn \(f_T(x)\) directly from target data alone.

3. **Limited applicability and efficiency of the model**:
   - Transfer learning can reduce the need for large amounts of labeled data and improve the generalization ability and prediction accuracy of the model by extracting useful knowledge from the source domain and transferring it to the target domain.
   - **Formula representation**: The goal of transfer learning is to use the information in the source domain (\(D_S\) and \(T_S\)) to improve the prediction function \(f_T(x)\) of the target domain, even if the tasks or data distributions of the two domains differ.

### Solutions

The paper describes the following types of transfer learning as ways to address the problems above:

1. **Inductive Transfer Learning**:
   - The source task and the target task are different, but they can share feature extractors or other parameters. For example, the low-level feature extractor of an image classification model can be transferred to a structural health monitoring task.

2. **Transductive Transfer Learning**:
   - The source and target tasks are the same, but the data distributions differ. Adapting the model to the data distribution of the target domain yields better performance there. For example, adversarial training can be used to make the feature generator produce domain-invariant features.

3. **Unsupervised Transfer Learning**:
   - The source and target tasks are different and no labeled data is available. By learning a common feature space between the source domain and the target domain, tasks such as anomaly detection can be carried out on the target domain.

### Conclusion

By introducing transfer learning, the paper aims to improve fault diagnosis and prediction when data and information are limited, reduce the dependence on large amounts of labeled data, and improve the generalization ability of the model.
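The inductive setting described under Solutions (reuse a feature extractor learned on the source domain and fit only a small task-specific part on scarce target labels) can be illustrated on synthetic data. The sketch below is not the paper's method. It assumes a hypothetical 50-channel sensor whose readings are driven by 3 latent degradation factors, and it swaps in a PCA projection for the deep feature extractor a real PHM model would use. The source labels are not needed for this particular extractor. All names (`simulate`, `mixing`, `extract`) and all numeric settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 50, 3                      # sensor channels, latent factors (both assumed)
mixing = rng.normal(size=(k, d))  # hypothetical map from latent factors to readings

def simulate(n, latent_mean, latent_scale, weights, bias):
    """Draw n samples: noisy sensor readings x and a health-indicator label y."""
    z = rng.normal(loc=latent_mean, scale=latent_scale, size=(n, k))
    x = z @ mixing + 0.05 * rng.normal(size=(n, d))  # small sensor noise
    y = z @ weights + bias
    return x, y

# Source domain D_S: plenty of data under one operating condition.
x_src, _ = simulate(500, 0.0, 1.0, np.array([0.5, 1.0, -1.0]), 0.0)

# Target domain D_T: shifted P(X) and a different f_T, with only 12 labeled samples.
w_tgt, b_tgt = np.array([1.0, -2.0, 0.5]), 3.0
x_tgt, y_tgt = simulate(12, 1.0, 2.0, w_tgt, b_tgt)
x_test, y_test = simulate(200, 1.0, 2.0, w_tgt, b_tgt)

# Step 1: learn the feature extractor on the source domain (top-k principal directions).
src_mean = x_src.mean(axis=0)
_, _, vt = np.linalg.svd(x_src - src_mean, full_matrices=False)
components = vt[:k].T             # shape (d, k); frozen from here on

def extract(x):
    """Apply the frozen source-domain extractor and append a constant bias column."""
    feats = (x - src_mean) @ components
    return np.hstack([feats, np.ones((len(x), 1))])

# Step 2: fit only a new linear head on the few labeled target samples.
head, *_ = np.linalg.lstsq(extract(x_tgt), y_tgt, rcond=None)

# Step 3: evaluate on held-out target data.
rmse = float(np.sqrt(np.mean((extract(x_test) @ head - y_test) ** 2)))
print(f"target-domain RMSE with transferred extractor: {rmse:.3f}")
print(f"parameters fitted on target data: {head.size} (vs {d + 1} from scratch)")
```

Only `k + 1` parameters are estimated from target data, so a dozen labeled samples are enough to determine them, while the `d`-dimensional structure comes from the source domain. In a deep learning setting the same pattern would correspond to freezing the lower layers of a pretrained network and retraining the final layers.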