Abstract: The problem of fault classification in industry has been studied extensively. Most classification algorithms assume balanced data. However, the difficulty of collecting industrial data differs considerably between modes. This inevitably leads to data imbalance, which degrades fault classification performance. This article proposes a novel data augmentation classifier (DAC) for imbalanced fault classification. Data augmentation based on generative adversarial networks (GANs) is an effective way to address imbalanced classification. However, the randomness of the GAN generation process limits the effectiveness of data augmentation. To solve this problem, DAC applies a data selection strategy based on data filtering and data purification during model training. In addition, DAC combines supervised learning and data generation to obtain an end-to-end model. Furthermore, a multigenerator structure of DAC (MDAC) is proposed to address the incomplete learning of a single generator when the data imbalance becomes more complicated. The proposed DAC and MDAC are applied to two fault classification cases of the Tennessee Eastman (TE) benchmark process, and the results show the superiority of DAC and MDAC over existing methods.

Note to Practitioners: Data imbalances are common in fault classification and reduce the effectiveness of modeling in industry. As generative models, generative adversarial networks (GANs) provide new ideas for augmenting small-class data. However, the instability of their training process and the randomness of data generation affect the results of data augmentation. In this article, the GAN generation process is analyzed in detail. The visualization results indicate that the data generated at any single point in training are never perfect. Based on the observed behavior of GAN data generation, we propose a data selection strategy applied during training. High-quality data are selected for data augmentation through data filtering and data purification. In addition, we combine the training of the GAN and the classification model for imbalanced data to reduce modeling time. We evaluate the effectiveness of this method on industrial examples.
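
The abstract does not state the criteria used for data filtering and data purification, so the following is only an illustrative sketch of what a two-stage selection of generated samples might look like, not the DAC procedure itself. It assumes, hypothetically, that filtering keeps samples the discriminator scores as sufficiently realistic, and that purification keeps samples a classifier assigns to the intended minority class with high confidence. The function name `select_generated`, both thresholds, and the toy values are assumptions made for illustration.

```python
from typing import List, Sequence


def select_generated(
    disc_scores: Sequence[float],
    class_probs: Sequence[Sequence[float]],
    target_class: int,
    disc_threshold: float = 0.5,   # hypothetical filtering threshold
    conf_threshold: float = 0.9,   # hypothetical purification threshold
) -> List[int]:
    """Return indices of generated samples that pass both selection stages."""
    kept = []
    for i, (score, probs) in enumerate(zip(disc_scores, class_probs)):
        # Stage 1 (assumed "filtering"): drop samples the discriminator
        # considers unrealistic.
        if score < disc_threshold:
            continue
        # Stage 2 (assumed "purification"): drop samples the classifier does
        # not confidently assign to the class they were generated for.
        predicted = max(range(len(probs)), key=lambda c: probs[c])
        if predicted != target_class or probs[target_class] < conf_threshold:
            continue
        kept.append(i)
    return kept


# Toy example: four generated samples intended for minority class 1.
disc_scores = [0.9, 0.2, 0.8, 0.7]
class_probs = [
    [0.05, 0.95],  # realistic and confidently class 1: kept
    [0.10, 0.90],  # confidently class 1 but unrealistic: filtered out
    [0.60, 0.40],  # realistic but predicted as class 0: purified out
    [0.08, 0.92],  # realistic and confidently class 1: kept
]
kept = select_generated(disc_scores, class_probs, target_class=1)
print(kept)
```

Under these assumptions, only the retained indices would be added to the minority-class training set at each training step, which is one way the randomness of individual generation steps could be kept out of the augmented data.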

Detecting Multi-Type Self-Admitted Technical Debt with Generative Adversarial Network-Based Neural Networks

Neural Network-based Detection of Self-Admitted Technical Debt

An Exploratory Study on the Introduction and Removal of Different Types of Technical Debt in Deep Learning Frameworks

Ignnvd: A Novel Software Vulnerability Detection Model Based on Integrated Graph Neural Networks

Deep Learning and Data Augmentation for Detecting Self-Admitted Technical Debt

Ensemble Data Augmentation for Imbalanced Fault Diagnosis

Data Augmentation Classifier for Imbalanced Fault Classification

Multiclass Classification for Self-Admitted Technical Debt Based on XGBoost

Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study

A Taxonomy of Self-Admitted Technical Debt in Deep Learning Systems

Self-Admitted Technical Debt Detection Approaches: A Decade Systematic Review

SATDAUG -- A Balanced and Augmented Dataset for Detecting Self-Admitted Technical Debt

Data Balancing Improves Self-Admitted Technical Debt Detection

MAT: A Simple Yet Strong Baseline for Identifying Self-Admitted Technical Debt

Detecting Self-Admitted Technical Debts via Prompt-Based Method in Issue-Tracking Systems

An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software

Identifying self-admitted technical debt in open source projects using text mining

Generative Adversarial Classification Network with Application to Network Traffic Classification

GGT: Graph-Guided Testing for Adversarial Sample Detection of Deep Neural Network

A Survey of Defect Detection Applications Based on Generative Adversarial Networks

CL-GAN: A GAN-based Continual Learning Model for Generating and Detecting AGDs