Dynamic Encoding and Decoding of Information for Split Learning in Mobile-Edge Computing: Leveraging Information Bottleneck Theory

Omar Alhussein, Moshi Wei, Arashmid Akhavain
DOI: https://doi.org/10.48550/arXiv.2309.02787
2023-09-06
Abstract: Split learning is a privacy-preserving distributed learning paradigm in which an ML model (e.g., a neural network) is split into two parts (i.e., an encoder and a decoder). The encoder shares so-called latent representation, rather than raw data, for model training. In mobile-edge computing, network functions (such as traffic forecasting) can be trained via split learning where an encoder resides in a user equipment (UE) and a decoder resides in the edge network. Based on the data processing inequality and the information bottleneck (IB) theory, we present a new framework and training mechanism to enable a dynamic balancing of the transmission resource consumption with the informativeness of the shared latent representations, which directly impacts the predictive performance. The proposed training mechanism offers an encoder-decoder neural network architecture featuring multiple modes of complexity-relevance tradeoffs, enabling tunable performance. The adaptability can accommodate varying real-time network conditions and application requirements, potentially reducing operational expenditure and enhancing network agility. As a proof of concept, we apply the training mechanism to a millimeter-wave (mmWave)-enabled throughput prediction problem. We also offer new insights and highlight some challenges related to recurrent neural networks from the perspective of the IB theory. Interestingly, we find a compression phenomenon across the temporal domain of the sequential model, in addition to the compression phase that occurs with the number of training epochs.
Machine Learning, Networking and Internet Architecture
What problem does this paper attempt to address?
The paper asks how, in mobile-edge computing, Information Bottleneck (IB) theory can be used to dynamically adjust the amount of information in the shared latent representation, so that the consumption of transmission resources is balanced against prediction performance. Specifically, it addresses the following challenges:

1. **Changes in network conditions and application requirements**: Existing prediction networks and application functions based on split learning face time-varying usage behaviors and traffic patterns on the network substrate. In addition, different applications and services have different Quality of Service (QoS) and prediction requirements.
2. **Adjustment of the amount of information in encoded data**: The amount of information in the encoded data must be adjusted dynamically according to network conditions and application requirements, so that communication stays efficient and flexible when network conditions vary.
3. **Improving prediction performance**: Using IB theory, find the optimal trade-off between compressing the input data and retaining task-relevant information, thereby improving prediction performance.

### Solution

To address these challenges, the paper proposes a new framework and training mechanism that uses IB theory to dynamically balance the consumption of transmission resources against the amount of information in the latent representation. Specific methods include:

- **Adaptive encoding and decoding framework**: Build an adaptive neural network encoding and decoding framework that can adjust the trade-off between complexity and relevance according to network conditions and application requirements.
- **Multi-mode complexity-relevance trade-off**: Provide multiple complexity-relevance modes, enabling the network to flexibly adjust performance under different conditions.
- **Dynamic selection of latent representation**: Different hidden-layer outputs can be selected as the latent representation to transmit, which dynamically adjusts the amount of information that is shared.

### Application example

As a proof of concept, the paper applies the proposed training mechanism to a millimeter-wave (mmWave) 5G throughput prediction problem and conducts experiments using the Lumos5G dataset. The experimental results show that the method can effectively adjust the amount of information in the latent representation under different network conditions, thereby optimizing prediction performance.

### Main findings

- **Compression phenomenon in the time dimension**: In the sequential model, compression occurs not only over the course of training but also across the time dimension.
- **Challenges in mutual information estimation**: Estimating mutual information (MI) for sequence models is difficult, especially when the number of hidden states is large. The paper therefore proposes methods such as using conditional mutual information to evaluate redundancy and reduce the complexity of estimation.

Through these methods, the paper provides new ideas and solutions for split learning in mobile-edge computing, which help to improve the flexibility and prediction performance of the network.
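
### Illustrative sketch

The idea of selecting different hidden-layer outputs as the transmitted latent representation can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's architecture or training mechanism: the layer widths, the random weights standing in for a trained model, and the names `encode`, `decode`, `run_layers`, and `mode` are all hypothetical. Each mode moves the split point one layer deeper, so the UE transmits a narrower latent. By the data processing inequality, a deeper layer cannot carry more information about the input than an earlier one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer widths: a 16-feature input narrowing to a scalar prediction.
LAYER_SIZES = [16, 12, 8, 4, 1]

# Random weights stand in for a trained network (assumption for illustration).
weights = [
    rng.standard_normal((m, n)) * 0.5
    for m, n in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
]
biases = [np.zeros(n) for n in LAYER_SIZES[1:]]


def run_layers(x, start, stop):
    """Apply layers start..stop-1; tanh on hidden layers, linear on the last."""
    last = len(weights) - 1
    for i in range(start, stop):
        x = x @ weights[i] + biases[i]
        if i != last:
            x = np.tanh(x)
    return x


def encode(x, mode):
    """UE side: run the first `mode` layers and return the latent to transmit."""
    return run_layers(x, 0, mode)


def decode(latent, mode):
    """Edge side: run the remaining layers on the received latent."""
    return run_layers(latent, mode, len(weights))


x = rng.standard_normal((5, LAYER_SIZES[0]))  # batch of 5 hypothetical samples
for mode in (1, 2, 3):
    latent = encode(x, mode)
    y = decode(latent, mode)
    print(f"mode={mode}: latent width={latent.shape[1]}, output shape={y.shape}")
```

In this plain feedforward sketch the prediction is the same for every split point, and only the width of the transmitted latent changes. The multi-mode training described in the paper, which ties each mode to a different complexity-relevance trade-off, is not reproduced here.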