Towards Resource-Efficient Federated Learning in Industrial IoT for Multivariate Time Series Analysis

Alexandros Gkillas, Aris Lalos
2024-11-06
Abstract: Anomalies and missing data constitute a thorny problem in industrial applications. In recent years, deep-learning-enabled anomaly detection has emerged as a critical direction; however, the improved detection accuracy is achieved by using large neural networks, which increases storage and computational cost. Moreover, the data collected at edge devices contain private user information, introducing challenges that can be addressed by the privacy-preserving distributed paradigm known as federated learning (FL). This framework allows edge devices to train and exchange models, which also increases the communication cost. Thus, to deal with the increased communication, processing, and storage challenges of FL-based deep anomaly detection, neural network (NN) pruning is expected to bring significant benefits by reducing processing, storage, and communication complexity. With this focus, a novel compression-based optimization problem is proposed at the server side of an FL paradigm that fuses the received local models and performs pruning, generating a more compressed model. Experiments in the context of anomaly detection and missing value imputation demonstrate that the proposed FL scenario, along with the proposed compression-based method, is able to achieve high compression rates (more than $99.7\%$) with negligible performance losses (less than $1.18\%$) compared to the centralized solutions.
Machine Learning, Artificial Intelligence, Signal Processing
What problem does this paper attempt to address?
This paper addresses resource-efficient federated learning for multivariate time-series analysis in the Industrial Internet of Things (IIoT). Specifically, it targets the following key challenges:

1. **Outliers and Missing Data**: In industrial applications, outliers and missing data are a persistent problem. Although deep learning has made significant progress in anomaly detection, these models usually require large amounts of computing and storage resources and may introduce privacy issues.
2. **Privacy Protection and Distributed Learning**: The data collected by edge devices contains private user information, so sharing it directly is not feasible. Federated learning (FL), a privacy-preserving distributed learning paradigm, trains models without sharing private data, but it increases communication costs because models are exchanged between devices and the server.
3. **Resource Efficiency**: To reduce the communication, processing, and storage costs of federated learning, especially on resource-constrained edge devices, compression techniques such as pruning can significantly reduce model size and complexity while maintaining performance.
4. **Processing of Multivariate Time-Series Data**: Different edge devices may be equipped with different sensors and measure different physical quantities, so each device may have a different data feature space. Traditional federated learning methods assume that all clients share the same feature space, which does not always hold in real-world scenarios.

To address these challenges, the paper proposes the following:

- **Novel Compression Optimization Problem**: A compression-based optimization problem is formulated on the server side to fuse the local models received from edge devices and generate a more compact global model.
- **Efficient ADMM Solver**: The compression optimization problem is solved with the Alternating Direction Method of Multipliers (ADMM), which handles the non-smooth \( \ell_1 \) regularization term.
- **Local Model Update on the Edge Device Side**: Each edge device processes its time-series data with a sliding-window method and keeps its local model consistent with the compressed global model during training.
- **Mask Fine-Tuning Process**: Only the non-zero weights are updated, which accelerates convergence of the global model and improves its accuracy.
- **Application of Autoencoder Model**: An autoencoder is used for anomaly detection and missing-value imputation, which suits multivariate time-series data.

In summary, the paper develops an efficient federated learning framework that handles multivariate time-series data in a resource-constrained Industrial Internet of Things environment while preserving privacy and model performance.
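The server-side fusion and mask fine-tuning ideas described above can be illustrated with a minimal NumPy sketch. It rests on assumptions not taken from the paper: the fusion objective is assumed to be a least-squares fit to the local weight vectors plus an \( \ell_1 \) penalty, and the names `admm_fuse`, `masked_sgd_step`, `lam`, and `rho` are hypothetical. The paper's actual formulation and solver details may differ.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def admm_fuse(local_weights, lam=0.3, rho=1.0, n_iter=200):
    """Fuse K local weight vectors into one sparse global vector.

    Assumed objective (illustrative, not taken from the paper):
        min_w  0.5 * sum_k ||w - w_k||^2 + lam * ||w||_1
    """
    W = np.asarray(local_weights, dtype=float)  # shape (K, d)
    num_clients, dim = W.shape
    w_sum = W.sum(axis=0)
    z = np.zeros(dim)  # sparse copy of the global weights
    u = np.zeros(dim)  # scaled dual variable
    for _ in range(n_iter):
        # w-update: closed-form minimizer of the quadratic terms
        w = (w_sum + rho * (z - u)) / (num_clients + rho)
        # z-update: soft-thresholding handles the non-smooth l1 term
        z = soft_threshold(w + u, lam / rho)
        # dual update: accumulate the constraint residual w - z
        u = u + w - z
    return z


def masked_sgd_step(weights, grad, mask, lr=0.1):
    """Gradient step that leaves pruned (masked-out) weights unchanged."""
    return weights - lr * grad * mask


# Three hypothetical clients, each sending a 4-dimensional weight vector.
local_models = [
    [1.0, 0.05, -2.0, 0.0],
    [1.2, -0.05, -1.8, 0.1],
    [0.8, 0.03, -2.2, -0.1],
]
global_w = admm_fuse(local_models)
mask = global_w != 0  # fine-tuning mask: True where a weight survived pruning
print(global_w)
print(mask)
```

Under this assumed objective the minimizer has a closed form, the soft-thresholded mean of the local vectors with threshold `lam / K`, so the ADMM loop is not strictly needed here; it is shown only to illustrate how the non-smooth term is split off into its own update. The resulting mask can then be passed to `masked_sgd_step` so that pruned weights stay at zero during fine-tuning.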