Ultra-Low-Latency Edge Inference for Distributed Sensing

Zhanwei Wang, Anders E. Kalør, You Zhou, Petar Popovski, Kaibin Huang
2024-07-18
Abstract: There is a broad consensus that artificial intelligence (AI) will be a defining component of the sixth-generation (6G) networks. As a specific instance, AI-empowered sensing will gather and process environmental perception data at the network edge, giving rise to integrated sensing and edge AI (ISEA). Many applications, such as autonomous driving and industrial manufacturing, are latency-sensitive and require end-to-end (E2E) performance guarantees under stringent deadlines. However, the 5G-style ultra-reliable and low-latency communication (URLLC) techniques designed with communication reliability and agnostic to the data may fall short in achieving the optimal E2E performance of perceptive wireless systems. In this work, we introduce an ultra-low-latency (ultra-LoLa) inference framework for perceptive networks that facilitates the analysis of the E2E sensing accuracy in distributed sensing by jointly considering communication reliability and inference accuracy. By characterizing the tradeoff between packet length and the number of sensing observations, we derive an efficient optimization procedure that closely approximates the optimal tradeoff. We validate the accuracy of the proposed method through experimental results, and show that the proposed ultra-LoLa inference framework outperforms conventional reliability-oriented protocols with respect to sensing performance under a latency constraint.
Numerical Analysis
What problem does this paper attempt to address?
The main problem this paper addresses is how to achieve ultra-low-latency (ultra-LoLa) inference in integrated sensing and edge AI (ISEA) systems in sixth-generation (6G) networks, so as to serve applications with strict end-to-end (E2E) performance requirements. Specifically:

1. **Application Requirements**: Many applications, such as autonomous driving and industrial manufacturing, are latency-sensitive and need E2E performance guarantees under stringent deadlines. Existing 5G-style ultra-reliable and low-latency communication (URLLC) techniques may fall short of the optimal E2E performance in perceptive wireless systems.
2. **Limitations of Existing Technologies**: Conventional URLLC techniques are designed around communication reliability and are agnostic to the data being carried, so they do not account for sensing quality. As a result, they may not reach the optimal E2E performance under a strict deadline, especially in tasks that require high sensing accuracy.
3. **Research Objectives**: The paper proposes an ultra-LoLa inference framework that analyzes and optimizes the E2E sensing accuracy in distributed sensing by jointly considering communication reliability and inference accuracy. By characterizing the tradeoff between packet length and the number of sensing observations, it derives an efficient optimization procedure that closely approximates the optimal tradeoff.
4. **Key Issues**:
   - How to overcome the communication bottleneck caused by short-packet transmission (SPT), especially when high-dimensional features must be transmitted.
   - How to design SPT techniques whose reliability is evaluated by E2E sensing performance rather than solely by the decoding error probability.
5. **Solutions**: The framework is built on a multi-view convolutional neural network (MVCNN) architecture in which features extracted at distributed sensors are sent over wireless links and aggregated at an edge server. The packet length is optimized to maximize the E2E sensing accuracy within the given task completion deadline.

In summary, the core problem of this paper is how to perform distributed sensing tasks with ultra-low latency in a 6G network by optimizing the tradeoff between communication reliability and sensing quality, thereby meeting strict requirements on both real-time performance and accuracy.
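The tradeoff between packet length and number of observations can be illustrated with a small numerical sketch. It is not the paper's optimization procedure. It assumes (none of this is stated in the summary above) that sensors share a fixed budget of channel uses in equal time slots, that each packet's decoding error follows the standard normal approximation for an AWGN channel at finite blocklength, and that inference accuracy saturates exponentially with the number of received views. The function names, the accuracy model `view_accuracy`, and all parameter values are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def packet_error(n, k, snr):
    """Normal approximation to the decoding error probability of a k-bit
    packet sent over n uses of an AWGN channel at linear SNR `snr`."""
    capacity = math.log2(1.0 + snr)
    dispersion = (1.0 - 1.0 / (1.0 + snr) ** 2) * math.log2(math.e) ** 2
    arg = (n * capacity - k + 0.5 * math.log2(n)) / math.sqrt(n * dispersion)
    return q_func(arg)


def view_accuracy(m, chance=0.1, ceiling=0.95, tau=3.0):
    """HYPOTHETICAL accuracy with m received views: starts at chance level
    and saturates toward `ceiling`. Not taken from the paper."""
    return ceiling - (ceiling - chance) * math.exp(-m / tau)


def e2e_accuracy(n, budget, k, snr, chance=0.1, ceiling=0.95, tau=3.0):
    """Expected accuracy when `budget` channel uses are split into packets
    of n uses each (one view per packet) and packets fail independently.

    The number of received views is Binomial(views, 1 - eps), so the mean of
    exp(-m / tau) has the closed form (eps + (1 - eps) * exp(-1 / tau)) ** views.
    """
    views = budget // n
    eps = packet_error(n, k, snr)
    decay = (eps + (1.0 - eps) * math.exp(-1.0 / tau)) ** views
    return ceiling - (ceiling - chance) * decay


def best_packet_length(budget, k, snr):
    """Exhaustive search over packet lengths 1..budget."""
    return max(range(1, budget + 1),
               key=lambda n: e2e_accuracy(n, budget, k, snr))
```

For example, calling `best_packet_length(1000, 128, 3.0)` searches a 1000-channel-use budget with 128-bit feature packets at an SNR of 3 (about 4.8 dB). Under these assumed models, very long packets are almost error-free but leave room for only a few views, while very short packets fit many views that mostly fail to decode, so the accuracy-maximizing length lies in between.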