An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation

Farah Alsafadi, Mahmoud Yaseen, Xu Wu

2024-10-25

Abstract: The confluence of ultrafast computers with large memory, rapid progress in Machine Learning (ML) algorithms, and the availability of large datasets places multiple engineering fields at the threshold of dramatic progress. However, a unique challenge in nuclear engineering is data scarcity, because experimentation on nuclear systems is usually more expensive and time-consuming than in most other disciplines. One potential way to resolve the data scarcity issue is deep generative learning, which uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data. In this way, one can significantly expand the dataset to train more accurate predictive ML models. In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models. We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data. Additionally, the DNN prediction uncertainties are quantified using Bayesian Neural Networks (BNN) and conformal prediction (CP) to assess the impact on predictive uncertainty reduction. To test the proposed methodology, we used TRACE simulations of steady-state void fraction data based on the NUPEC Boiling Water Reactor Full-size Fine-mesh Bundle Test (BFBT) benchmark. We found that augmenting the training dataset using VAEs improved the DNN model's predictive accuracy, improved the prediction confidence intervals, and reduced the prediction uncertainties.
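For orientation, the following is a minimal sketch of two standard VAE ingredients: the reparameterized latent sample and the closed-form KL term of the training loss. It assumes a diagonal Gaussian encoder, a standard normal prior, and a squared-error reconstruction term; none of these choices, nor the function names, are taken from the paper, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def vae_loss(x, x_recon, mu, log_var):
    """Assumed loss: squared reconstruction error plus the KL regularizer."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so the draw is repeatable
    mu, log_var = [0.3, -0.1], [-1.0, -2.0]   # hypothetical encoder outputs
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

Once such a model is trained, synthetic samples are typically produced by drawing latent vectors from the prior and passing them through the decoder; that step depends on the trained network and is not shown here.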

Machine Learning

What problem does this paper attempt to address?

This paper addresses data scarcity in nuclear engineering, which arises because experiments are costly and time-consuming. Specifically, the authors explore data augmentation with deep generative models based on Variational Autoencoders (VAEs) to improve the accuracy of Deep Neural Network (DNN) predictions and reduce prediction uncertainty.

#### Core problems

1. **Data scarcity problem**: Experiments in nuclear engineering are usually more expensive and time-consuming than in other disciplines, so the amount of available data is limited.
2. **Prediction accuracy and uncertainty**: Under data scarcity, can data augmentation improve the prediction accuracy of DNN models and reduce their prediction uncertainty?

#### Solutions

- **Data augmentation**: Use VAEs to generate synthetic data and expand the training data set.
- **Evaluation methods**:
  - Compare the prediction accuracy of DNN models trained with original data and with augmented data.
  - Use Bayesian Neural Networks (BNN) and Conformal Prediction (CP) to quantify and evaluate prediction uncertainty.

#### Specific objectives

1. **Improve prediction accuracy**: Evaluate the prediction performance of DNN models by comparing training data sets augmented with different numbers of synthetic samples.
2. **Narrow the confidence interval**: Use the CP method to calculate the confidence interval of DNN predictions and analyze how the interval width changes as the amount of synthetic data increases.
3. **Reduce prediction uncertainty**: Use BNN to evaluate the uncertainty of DNN predictions under different training data sets.

### Paper structure

1. **Problem definition**: Introduce the data set used for training and the experimental setup.
2. **Methodology**:
   - VAEs are used to generate synthetic data.
   - CP is used to calculate the confidence interval of predictions.
   - BNN is used to quantify prediction uncertainty.
3. **Results**: Show the specific impact of data augmentation on DNN prediction accuracy and uncertainty.
4. **Discussion and conclusion**: Summarize the research findings and propose future research directions.

Through these methods, the paper aims to demonstrate the effectiveness and potential application value of data augmentation techniques in nuclear engineering.
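As a rough illustration of the CP step listed under Methodology, the sketch below implements split conformal prediction with absolute-residual scores. This variant is an assumption; the summary does not state which form of CP the paper uses. The function names and the toy calibration numbers are hypothetical and are not taken from the BFBT data.

```python
import math


def conformal_quantile(calib_true, calib_pred, alpha=0.1):
    """Split-conformal quantile of absolute residuals on a held-out calibration set."""
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee coverage at this alpha.
        return math.inf
    return scores[rank - 1]


def prediction_interval(pred, q):
    """Symmetric interval around a point prediction."""
    return (pred - q, pred + q)


if __name__ == "__main__":
    # Toy calibration set (hypothetical): true values and model predictions.
    calib_true = [10.0, 20.0, 30.0]
    calib_pred = [11.0, 18.0, 33.0]
    q = conformal_quantile(calib_true, calib_pred, alpha=0.5)
    print("half-width:", q)
    print("interval for a new prediction of 5.0:", prediction_interval(5.0, q))
```

In this setup the interval half-width is driven by the model's calibration residuals, so a DNN that fits the calibration set more closely after augmentation would yield narrower intervals at the same nominal coverage.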
