Learn to Slice, Slice to Learn: Unveiling Online Optimization and Reinforcement Learning for Slicing AI Services

Amr Abo-eleneen, Menna Helmy, Alaa Awad Abdellatif, Aiman Erbad, Amr Mohamed, Mohamed Abdallah
2024-11-06
Abstract: In the face of increasing demand for zero-touch networks to automate network management and operations, two pivotal concepts have emerged: "Learn to Slice" (L2S) and "Slice to Learn" (S2L). L2S involves leveraging artificial intelligence (AI) techniques to optimize network slicing for general services, while S2L centers on tailoring network slices to meet the specific needs of various AI services. The complexity of optimizing and automating S2L surpasses that of L2S due to the intricate requirements of AI services, such as handling uncontrollable parameters, learning in adversarial conditions, and achieving long-term performance goals. This paper aims to automate and optimize S2L by integrating the two concepts, using an intelligent slicing agent to solve S2L. We choose two candidate slicing agents, namely the Exponential-weight algorithm for Exploration and Exploitation (EXP3) and Deep Q-Network (DQN), from the Online Convex Optimization (OCO) and Deep Reinforcement Learning (DRL) frameworks, respectively, and compare them. Our evaluation involves a series of carefully designed experiments that offer valuable insights into the strengths and limitations of EXP3 and DQN in slicing for AI services, thereby contributing to the advancement of zero-touch network capabilities.
Networking and Internet Architecture
What problem does this paper attempt to address?
The core problem this paper attempts to solve is: **how to optimize and automate network slicing for AI services by integrating the concepts of "Learn to Slice" (L2S) and "Slice to Learn" (S2L), using intelligent slicing agents to meet complex AI service requirements.** Specifically, the paper focuses on the following aspects:

1. **Optimizing network slicing to meet the requirements of AI services**:
   - L2S uses AI techniques to optimize network slicing for general services.
   - S2L customizes network slices for various AI services to meet their specific requirements, such as handling uncontrollable parameters, learning under adversarial conditions, and achieving long-term performance goals.
2. **Addressing the unique challenges of AI services**:
   - **Resource allocation and hyperparameter selection**: Find the best combination of resources and hyperparameters for different model architectures, complexities, and goals while minimizing the impact on other services.
   - **Data quality and generalization ability**: Account for multiple uncontrollable parameters that affect model accuracy and generalization, such as data quality (whether the data is independent and identically distributed, balanced, unique, and complete).
   - **Adaptability**: Adapt to different model and data requirements, learn to operate in potentially untrusted environments (for example, under adversarial attacks), and follow long-term goals (such as reducing costs or maintaining component reliability).
   - **Calculating model accuracy**: Model accuracy is only observed after specific AI hyperparameters are applied to data of a given quality, so it is stochastic and varies across models and datasets.
3. **Achieving the capabilities of zero-touch networks**:
   - Automatically manage and orchestrate the network's computing and communication resources so that multiple services can run in parallel on the same physical infrastructure, thereby improving network management efficiency.
To address these challenges, the paper proposes two intelligent slicing agents: the EXP3 algorithm from the online convex optimization (OCO) framework and the Deep Q-Network (DQN) algorithm from the deep reinforcement learning (DRL) framework. The two algorithms have different characteristics and strengths, and the paper compares how well each supports network slicing for a variety of AI services.

### Formula summary

- **Utility function**:
  \[ U(t) = \sum_{m=1}^{M} A_m\big(N(t)_m, k(t)_m\big) \times q(t)_m \]
  where:
  - \(A_m\) represents the accuracy of model \(m\), which is a function of the data length \(N(t)_m\) and the number of training rounds \(k(t)_m\).
  - \(q(t)_m\) represents the uncontrollable data quality of the user-provided dataset.
- **Decision variables**:
  - The proportion of the user dataset that is used.
  - The data rate for transmitting data from the user to the edge node.
  - The edge CPU frequency used for model training.
  - The number of training rounds.

### Experimental evaluation

The paper evaluates the performance of EXP3 and DQN in slicing for AI services through a series of carefully designed experiments and explores their advantages and disadvantages in the following aspects:

1. **Convergence and adaptation to new environments**.
2. **Learning in the presence of adversaries (adversarial resilience)**.
3. **Applicability in real-time scenarios (learning/execution cost)**.
4. **Compliance with long-term goals**, such as maintaining the reliability of hardware components for as long as possible.

Through these experiments, the paper provides insights for optimizing and automating S2L and for advancing zero-touch network capabilities.
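
To make the utility function and the EXP3 agent described above more concrete, here is a minimal Python sketch. It is not the paper's implementation. The saturating accuracy curve `accuracy`, the three-entry action grid `ACTIONS`, the exploration rate `gamma`, and the uniform draw for data quality are all assumptions made for illustration, and resource costs and the remaining decision variables (data rate, CPU frequency) are left out. Only the form of \(U(t)\) and the textbook EXP3 update are taken as given.

```python
import math
import random


def accuracy(n_samples, rounds):
    """Hypothetical saturating accuracy curve A_m(N, k) with values in [0, 1)."""
    return 1.0 - math.exp(-1e-3 * n_samples * rounds)


def utility(allocations, qualities):
    """U(t) = sum over models m of A_m(N_m, k_m) * q_m for one time step."""
    return sum(accuracy(n, k) * q for (n, k), q in zip(allocations, qualities))


class Exp3:
    """Textbook EXP3 bandit over a finite set of slicing actions."""

    def __init__(self, n_actions, gamma=0.1, seed=0):
        self.n = n_actions
        self.gamma = gamma              # exploration rate (assumed value)
        self.weights = [1.0] * n_actions
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def select(self):
        probs = self.probabilities()
        action = self.rng.choices(range(self.n), weights=probs)[0]
        return action, probs[action]

    def update(self, action, prob, reward):
        # Reward is expected in [0, 1]; dividing by the selection
        # probability gives an unbiased estimate for the chosen action.
        estimate = reward / prob
        self.weights[action] *= math.exp(self.gamma * estimate / self.n)
        # Rescale to avoid overflow; the probabilities are unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


# Hypothetical discretized actions: (data length N, training rounds k).
ACTIONS = [(100, 1), (500, 2), (1000, 5)]


def run(steps=5000, seed=0):
    """Train EXP3 on a single model (M = 1) with random data quality."""
    agent = Exp3(len(ACTIONS), seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        action, prob = agent.select()
        quality = env.uniform(0.5, 1.0)   # uncontrollable q(t), assumed range
        reward = utility([ACTIONS[action]], [quality])
        agent.update(action, prob, reward)
    return agent.probabilities()


if __name__ == "__main__":
    print(run())
```

Because this toy setup models no resource cost, the probability mass tends to concentrate on the largest \((N, k)\) pair. A more realistic reward would subtract a cost or enforce constraints on bandwidth and CPU frequency, which is where the trade-offs studied in the paper would appear.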