Resource-Efficient Sensor Fusion via System-Wide Dynamic Gated Neural Networks

Chetna Singhal, Yashuo Wu, Francesco Malandrino, Sharon Ladron de Guevara Contreras, Marco Levorato, Carla Fabiana Chiasserini
2024-10-22
Abstract: Mobile systems will have to support multiple AI-based applications, each leveraging heterogeneous data sources through DNN architectures collaboratively executed within the network. To minimize the cost of the AI inference task subject to requirements on latency, quality, and, crucially, reliability of the inference process, it is vital to optimize (i) the set of sensors/data sources, (ii) the DNN architecture, (iii) the network nodes executing sections of the DNN, and (iv) the resources to use. To this end, we leverage dynamic gated neural networks with branches, and propose a novel algorithmic strategy called Quantile-constrained Inference (QIC), based upon quantile-constrained policy optimization. QIC makes joint, high-quality, swift decisions on all the above aspects of the system, with the aim of minimizing inference energy cost. We remark that this is the first contribution connecting gated dynamic DNNs with infrastructure-level decision making. We evaluate QIC using a dynamic gated DNN with stems and branches for optimal sensor fusion and inference, trained on the RADIATE dataset offering Radar, LiDAR, and Camera data, and real-world wireless measurements. Our results confirm that QIC matches the optimum and outperforms its alternatives by over 80%.
Artificial Intelligence, Networking and Internet Architecture
### What problems does this paper attempt to solve?

This paper addresses how to jointly optimize sensor fusion and deep neural network (DNN) architectures when mobile systems must support multiple AI-based applications. Specifically, it seeks to minimize the cost of inference (energy consumption) while meeting each inference task's latency and quality requirements and, in particular, meeting them reliably. To this end, the paper proposes a new algorithmic strategy named **Quantile-constrained Inference (QIC)**.

#### Main problems include:

1. **Selecting sensors/data sources**: Determine which sensors or data sources each application should use.
2. **DNN architecture design**: Select the parts of the DNN architecture (such as stems and branches) suited to the specific application scenario.
3. **Network node allocation**: Decide which network nodes (such as mobile devices and edge servers) execute the different parts of the DNN.
4. **Resource allocation**: Allocate computing and communication resources so that inference is efficient and consumes little energy.

#### Key contributions of the paper:

- **Dynamic gated neural networks**: The paper leverages dynamic gated neural networks with stems and branches, which can adapt the execution path to the characteristics of the input data.
- **QIC algorithm**: Based on quantile-constrained policy optimization (QCPO), QIC minimizes energy consumption while meeting the inference quality (such as accuracy) and latency requirements.
- **Infrastructure-level decision-making**: The paper connects the internal structure of the DNN with infrastructure-level decisions, choosing where each part of the DNN runs and which resources it uses.

#### Application scenarios:

The paper evaluates QIC using a dynamic gated DNN trained on the RADIATE dataset, which provides radar, LiDAR, and camera data, together with real-world wireless measurements. The experimental results show that QIC matches the optimum and outperforms the alternative methods by over 80%.

#### Summary:

This paper shows how to make AI inference more efficient in a multi-sensor, multi-application environment through joint decisions on sensors, DNN structure, node placement, and resources. The proposed method meets the quality and latency requirements of inference while significantly reducing energy consumption, and it targets mobile and edge-computing scenarios.
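To make the idea of a quantile constraint concrete, the following is a minimal sketch and not the paper's QIC algorithm. QIC is described as learning decisions through quantile-constrained policy optimization, whereas this sketch simply enumerates a handful of candidates and picks the cheapest feasible one. The `Config` class, the function names, and all numbers (energy, accuracy, latency samples, the 0.9 quantile, the 50 ms budget) are hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Config:
    """One hypothetical joint decision: sensors, DNN branch, node, and cost."""
    sensors: Tuple[str, ...]
    branch: str                     # which branch of the gated DNN runs
    node: str                       # where the branch is executed
    energy_j: float                 # assumed energy per inference (joules)
    accuracy: float                 # assumed inference quality in [0, 1]
    latency_ms: Tuple[float, ...]   # assumed latency samples (milliseconds)


def empirical_quantile(samples: Sequence[float], q: float) -> float:
    """Return the smallest sample x with at least a fraction q of samples <= x."""
    if not samples or not 0.0 < q <= 1.0:
        raise ValueError("need at least one sample and 0 < q <= 1")
    ordered = sorted(samples)
    # Small tolerance guards against floating-point error in q * n.
    rank = math.ceil(q * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]


def cheapest_feasible(
    configs: Sequence[Config],
    q: float,
    latency_budget_ms: float,
    min_accuracy: float,
) -> Optional[Config]:
    """Pick the lowest-energy config whose q-quantile latency fits the budget."""
    feasible = [
        c for c in configs
        if c.accuracy >= min_accuracy
        and empirical_quantile(c.latency_ms, q) <= latency_budget_ms
    ]
    return min(feasible, key=lambda c: c.energy_j, default=None)


# Hypothetical candidates; none of these numbers come from the paper.
CONFIGS = (
    Config(("camera",), "early", "device", 0.8, 0.70,
           (18, 19, 20, 21, 22, 20, 19, 21, 22, 23)),
    Config(("camera", "radar"), "mid", "edge", 1.5, 0.85,
           (30, 32, 35, 31, 33, 34, 36, 90, 38, 37)),
    Config(("camera", "lidar"), "mid-lidar", "edge", 1.2, 0.88,
           (30, 31, 32, 33, 34, 35, 36, 37, 80, 95)),
    Config(("camera", "radar", "lidar"), "full", "edge", 2.4, 0.93,
           (40, 41, 42, 43, 44, 45, 46, 47, 48, 49)),
)

best = cheapest_feasible(CONFIGS, q=0.9, latency_budget_ms=50.0, min_accuracy=0.8)
print(best.sensors, best.branch, best.energy_j)
```

In this toy data, the camera-plus-LiDAR candidate is cheaper than the selected one and its mean latency (44.3 ms) is within the 50 ms budget, yet its 0.9-quantile latency is 80 ms, so a quantile constraint rejects it. This is the sense in which a quantile constraint expresses reliability rather than average-case performance.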