Mitigate noisy data for smart IoT via GAN based machine unlearning

Liu, Yang
DOI: https://doi.org/10.1007/s11432-022-3671-9
2024-02-08
Science China Information Sciences
Abstract: With the development of IoT applications, machine learning dramatically improves the utility of various IoT systems such as autonomous driving. Although the pretrain-finetune framework copes well with data heterogeneity in complex IoT scenarios, the data collected by sensors often contain unexpected noisy data, e.g., out-of-distribution (OOD) data, which reduce the performance of fine-tuned models. To resolve this problem, this paper proposes MuGAN, a method that mitigates the side effects of OOD data via generative adversarial network (GAN)-based machine unlearning. MuGAN follows a straightforward but effective idea to mitigate the performance loss caused by OOD data, i.e., "flashbacking" the model to the condition in which OOD data are excluded from model training. To achieve this goal, we design an adversarial game in which a discriminator is trained to identify whether a sample belongs to the training set by observing its confidence score. Meanwhile, a generator (i.e., the target model) is updated to fool the discriminator into believing that the OOD data are not included in the training set while the remaining data are. The experimental results show that, thanks to its high unlearning rate (more than 90%) and retention rate (99%), MuGAN reduces the model performance degradation caused by OOD data from 5.88% to 0.8%.
computer science, information systems; engineering, electrical & electronic
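The adversarial game described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the one-dimensional logistic discriminator, the binary cross-entropy losses, the function names, and the values of `w` and `b` are all assumptions chosen for illustration. The discriminator maps a confidence score to a membership probability, and the generator objective rewards the target model when OOD samples look like non-members while the retained samples still look like members.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def discriminator(confidence, w, b):
    """Hypothetical discriminator: probability that a sample with this
    confidence score was part of the training set."""
    return sigmoid(w * confidence + b)

def bce(p, label, eps=1e-12):
    """Binary cross-entropy for a single prediction p and a 0/1 label."""
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))

def discriminator_loss(member_conf, nonmember_conf, w, b):
    """Discriminator objective: members -> 1, non-members -> 0."""
    losses = [bce(discriminator(c, w, b), 1) for c in member_conf]
    losses += [bce(discriminator(c, w, b), 0) for c in nonmember_conf]
    return sum(losses) / len(losses)

def generator_loss(ood_conf, retained_conf, w, b):
    """Generator (target model) objective, assumed form: OOD samples should
    be judged non-members (0), retained samples should stay members (1)."""
    losses = [bce(discriminator(c, w, b), 0) for c in ood_conf]
    losses += [bce(discriminator(c, w, b), 1) for c in retained_conf]
    return sum(losses) / len(losses)

# Assumed discriminator parameters: higher confidence -> more likely a member.
w, b = 4.0, -2.0

# Confidence scores are made-up numbers, not results from the paper.
before = generator_loss(ood_conf=[0.9], retained_conf=[0.95], w=w, b=b)
after = generator_loss(ood_conf=[0.3], retained_conf=[0.95], w=w, b=b)
print(f"generator loss before unlearning: {before:.3f}")
print(f"generator loss after unlearning:  {after:.3f}")
```

In this toy setting the generator loss drops when the model's confidence on the OOD sample falls while its confidence on the retained sample is unchanged, which mirrors the stated goal of a high unlearning rate together with a high retention rate. A real system would alternate gradient updates between the two objectives rather than evaluate them at fixed parameters.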
What problem does this paper attempt to address?