Usage-Specific Survival Modeling Based on Operational Data and Neural Networks

Olov Holmer, Mattias Krysander, Erik Frisk
2024-03-28
Abstract: Accurate predictions of when a component will fail are crucial when planning maintenance, and by modeling the distribution of these failure times, survival models have been shown to be particularly useful in this context. The presented methodology is based on conventional neural network-based survival models that are trained using data that is continuously gathered and stored at specific times, called snapshots. An important property of this type of training data is that it can contain more than one snapshot from a specific individual, so standard maximum likelihood training cannot be directly applied because the data is not independent. However, the paper shows that if the data is in a specific format where all snapshot times are the same for all individuals, called homogeneously sampled, maximum likelihood training can be applied and produces desirable results. In many cases the data is not homogeneously sampled, and it is then proposed to resample the data to make it homogeneously sampled. How densely the dataset is sampled turns out to be an important parameter; the density should be chosen large enough to produce good results, but this also increases the size of the dataset, which makes training slow. To reduce the number of samples needed during training, the paper also proposes, instead of resampling the dataset once before training starts, randomly resampling the dataset at the start of each epoch during training. The proposed methodology is evaluated on both a simulated dataset and an experimental dataset of starter battery failures. The results show that if the data is homogeneously sampled, the methodology works as intended and produces accurate survival models. The results also show that randomly resampling the dataset on each epoch is an effective way to reduce the size of the training data.
Machine Learning, Systems and Control
What problem does this paper attempt to address?
### What problems does this paper attempt to solve?

This paper aims to solve two key problems:

1. **Define the usage-specific survival function**: Existing survival models are often unable to fully consider the lifespan changes of components under different usage conditions. Therefore, a more general method is needed to define the usage-specific survival function, which can take into account the operational data collected within any time interval.
2. **Training methods for handling non-independent observational data**: When a data set contains multiple observations from the same individual, the standard maximum likelihood estimation method is no longer applicable because these data points are not independent. Therefore, a new training method needs to be developed to effectively handle this type of dependent data and ensure the accuracy and reliability of the model.

### Specific background and challenges

- **High maintenance costs**: In many applications, maintenance costs account for a large part of the total system cost. In order to minimize unnecessary maintenance, accurately predicting the remaining useful life of components becomes crucial.
- **Complexity of the degradation process**: The degradation process of components is usually complex and non-deterministic, and it is difficult to accurately predict their failure times. Therefore, statistical descriptions (such as survival models) are more useful than deterministic predictions.
- **Limitations of existing methods**:
  - Although methods such as random survival forests are effective, they may not be applicable to specific application scenarios in some cases.
  - Neural network-based survival models perform well in predicting the remaining life, but lack specific methods on how to incorporate operational data into the model.
- **Multi-observational data problem**: The methods mentioned in the existing literature can usually only make predictions at a specific point in time, or do not discuss how to handle multiple observations from the same individual.

### Methods proposed in the paper

To solve the above problems, the paper proposes the following methods:

1. **Usage-specific survival modeling based on operational data**: By collecting and analyzing the operational data of components throughout their life cycles, a model that can predict the remaining life distribution of components is constructed. Specifically, the model is based on a neural network and is trained using continuously collected and stored data snapshots.
2. **Improvement of maximum likelihood training**: To address the problem of non-independent observational data, the paper proposes a method called "quasi-likelihood". The data is resampled so that it becomes homogeneously sampled, meaning that all snapshot times are the same for all individuals, after which standard maximum likelihood training can be applied.
3. **Dynamic resampling strategy**: Denser sampling gives better results but also a larger dataset and slower training. To reduce the number of samples needed, the paper proposes randomly resampling the dataset at the start of each training epoch instead of resampling it once before training starts.

### Experimental verification

The paper evaluates the proposed methods on a simulated data set and an experimental data set of starter battery failures. The results show that when the data is homogeneously sampled, the method produces accurate survival models, and that randomly resampling on each epoch is an effective way to reduce the size of the training data.
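
The two resampling ideas described above can be illustrated with a small sketch. This is not the paper's implementation: the function names, the dictionary layout of `individuals`, the zero-order hold (reusing the latest recorded snapshot at each grid time), and the use of one random offset per epoch are all assumptions made for illustration. The only property taken from the text is that, within one resampled dataset, every individual shares the same snapshot times.

```python
import numpy as np


def resample_individual(times, features, end_time, grid):
    """Resample one individual's snapshots onto shared grid times.

    Assumption: a zero-order hold is used, i.e. each grid time gets the
    latest snapshot recorded at or before it.
    """
    times = np.asarray(times, dtype=float)
    features = np.asarray(features, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Keep grid times at which the individual is under observation:
    # after its first snapshot and before failure or censoring.
    valid = grid[(grid >= times[0]) & (grid < end_time)]
    # Index of the latest recorded snapshot at or before each grid time.
    idx = np.searchsorted(times, valid, side="right") - 1
    # Remaining time from each resampled snapshot to failure or censoring.
    return valid, features[idx], end_time - valid


def random_grid(step, horizon, rng):
    """Grid shared by all individuals, shifted by a random offset.

    Assumption: one offset is drawn per epoch, so snapshot times are
    identical across individuals but change from epoch to epoch.
    """
    offset = rng.uniform(0.0, step)
    return np.arange(offset, horizon, step)


def build_dataset(individuals, grid):
    """Stack rows of [snapshot time, features..., remaining time, event flag]."""
    rows = []
    for ind in individuals:
        t, x, tau = resample_individual(
            ind["times"], ind["features"], ind["end_time"], grid
        )
        event = np.full(t.shape, float(ind["failed"]))
        rows.append(np.column_stack([t, x, tau, event]))
    return np.vstack(rows)


# Hypothetical toy data: one failed and one censored individual.
individuals = [
    {"times": [0.0, 3.0, 7.0], "features": [[1.0], [2.0], [3.0]],
     "end_time": 10.0, "failed": True},
    {"times": [1.0, 4.0], "features": [[0.5], [0.8]],
     "end_time": 6.0, "failed": False},
]

# Static resampling: one fixed grid chosen before training.
static_data = build_dataset(individuals, np.arange(0.0, 12.0, 2.5))

# Per-epoch random resampling: a new shared grid at the start of each epoch.
rng = np.random.default_rng(0)
for epoch in range(3):
    epoch_data = build_dataset(individuals, random_grid(2.5, 12.0, rng))
    # A model-specific training pass over epoch_data would go here.
```

In this sketch a coarse `step` keeps each epoch's dataset small, while the changing offset lets the model see different snapshot times over the course of training.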
### Summary

The main contribution of the paper is an effective usage-specific survival modeling method based on operational data, together with a solution to the problem of training on non-independent observational data. This provides a more accurate and reliable method for maintenance planning and life prediction.
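
To make the training problem concrete, the usual right-censored log-likelihood, summed over all snapshots, can be written as follows. The notation is an assumption for illustration and is not taken from the paper: $\tau_{ij}$ is the time from snapshot $j$ of individual $i$ to failure or censoring, $x_{ij}$ is the data available at that snapshot, $\delta_i$ is 1 if individual $i$ failed and 0 if it was censored, and $f_\theta$ and $S_\theta$ are the modeled density and survival function.

$$
\ell(\theta) = \sum_{i=1}^{N} \sum_{j=1}^{M_i} \Big[ \delta_i \log f_\theta(\tau_{ij} \mid x_{ij}) + (1 - \delta_i) \log S_\theta(\tau_{ij} \mid x_{ij}) \Big]
$$

When $M_i > 1$, the inner sum contains several terms from the same individual, so the terms are not independent. This is the situation in which, according to the abstract, standard maximum likelihood training cannot be applied directly unless the data is homogeneously sampled.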