Abstract: The problem of fault classification in industry has been studied extensively. Most classification algorithms are built on the premise that the data are balanced. However, the difficulty of collecting industrial data varies considerably from one operating mode to another. This inevitably leads to data imbalance, which degrades fault classification performance. This article proposes a novel data augmentation classifier (DAC) for imbalanced fault classification. Data augmentation based on generative adversarial networks (GANs) is an effective way to address imbalanced classification. However, the randomness of the GAN generation process limits the benefit of data augmentation. To solve this problem, DAC introduces a data selection strategy based on data filtering and data purification during model training. In addition, DAC combines supervised learning with the data generation process to obtain an end-to-end model. A multigenerator structure of DAC (MDAC) is also proposed to address the incomplete learning of a single generator when the data imbalance becomes more complicated. The proposed DAC and MDAC are applied to two fault classification cases of the Tennessee Eastman (TE) benchmark process, and the results show the superiority of DAC and MDAC over existing methods.

Note to Practitioners: Data imbalances are common in fault classification and reduce the effectiveness of modeling in industry. As generative models, generative adversarial networks (GANs) provide new ideas for augmenting small-class data. However, the instability of the GAN training process and the randomness of data generation affect the results of data augmentation. In this article, the GAN generation process is analyzed in detail. The visualization results indicate that the data generated at any single point in training are never perfect. Based on the rules of GAN data generation, we propose a data selection strategy that is applied during training. High-quality data are selected for data augmentation through data filtering and data purification. In addition, we combine the training of the GAN and of the classification model for imbalanced data to reduce modeling time. We have evaluated the effectiveness of this method on industrial examples.
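
The idea of selecting only high-quality generated samples before augmenting a small class can be illustrated with a minimal sketch. The abstract does not state the actual filtering or purification criteria used by DAC, so both stages below are assumptions made purely for illustration: filtering is taken to be a threshold on a per-sample quality score (for example a discriminator or classifier output), and purification is taken to be keeping the generated samples closest to real minority data. The function name `select_generated` and its parameters are hypothetical.

```python
import numpy as np


def select_generated(generated, scores, real_minority,
                     score_threshold=0.5, keep_fraction=0.5):
    """Two-stage selection of synthetic samples.

    Illustrative only: both criteria are assumptions, not the rule used by DAC.
    """
    generated = np.asarray(generated, dtype=float)
    scores = np.asarray(scores, dtype=float)
    real_minority = np.asarray(real_minority, dtype=float)

    # Stage 1 (filtering, assumed criterion): keep samples whose quality
    # score reaches the threshold.
    kept = generated[scores >= score_threshold]
    if len(kept) == 0:
        return kept

    # Stage 2 (purification, assumed criterion): rank the survivors by the
    # Euclidean distance to their nearest real minority sample and keep the
    # closest fraction.
    diffs = kept[:, None, :] - real_minority[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
    n_keep = max(1, int(np.ceil(keep_fraction * len(kept))))
    order = np.argsort(nearest)[:n_keep]
    return kept[order]


# Toy data standing in for real minority-class samples and GAN output.
rng = np.random.default_rng(0)
real_minority = rng.normal(loc=2.0, scale=0.3, size=(20, 2))
generated = rng.normal(loc=2.0, scale=1.0, size=(200, 2))
scores = rng.uniform(size=200)  # stand-in for a model's quality score

selected = select_generated(generated, scores, real_minority)
augmented_minority = np.vstack([real_minority, selected])
```

In an end-to-end setting such as the one the abstract describes, a selection step of this kind would run inside the training loop rather than once after training; the sketch applies it a single time only to keep the example short.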

AEGAN-Pathifier: A Data Augmentation Method to Improve Cancer Classification for Imbalanced Gene Expression Data

A Data Augmentation Method Based on Generative Adversarial Networks for Grape Leaf Disease Identification

Cancer Classification with Data Augmentation Based on Generative Adversarial Networks

Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data

CEGAN: Classification Enhancement Generative Adversarial Networks for unraveling data imbalance problems

Enhancing Histopathological Image Classification Performance through Synthetic Data Generation with Generative Adversarial Networks

Increasing prediction accuracy of pathogenic staging by sample augmentation with a GAN

Ensemble Data Augmentation for Imbalanced Fault Diagnosis

A tutorial on generative adversarial networks with application to classification of imbalanced data

Data Augmentation Classifier for Imbalanced Fault Classification

Synthetic augmentation for semantic segmentation of class imbalanced biomedical images: A data pair generative adversarial network approach

Breast Cancer Histopathological Image Classification with Adversarial Image Synthesis

Synthetic Boosted Resampling Using Deep Generative Adversarial Networks: A Novel Approach to Improve Cancer Prediction from Imbalanced Datasets

Intelligent phenotype-detection and gene expression profile generation with generative adversarial networks

phylaGAN: data augmentation through conditional GANs and autoencoders for improving disease prediction accuracy using microbiome data

Imbalanced medical disease dataset classification using enhanced generative adversarial network

An intra-class distribution-focused generative adversarial network approach for imbalanced tabular data learning

An Autoencoder and Generative Adversarial Networks Approach for Multi-Omics Data Imbalanced Class Handling and Classification

KGA: Integrating KPCA and GAN for Microbial Data Augmentation

A Data Augmentation Methodology to Reduce the Class Imbalance in Histopathology Images

Generative Adversarial Network Based Data Augmentation to Improve Cervical Cell Classification Model