DIDS: A Distributed Inference Framework with Dynamic Scheduling Capability

Yuwei Yan, Yikun Hu, Qinyun Cai, Wangdong Yang, Kenli Li
DOI: https://doi.org/10.1016/j.future.2024.07.037
2025-01-01
Abstract: Distributed inference is a promising solution for deploying Deep Neural Network (DNN) applications in resource-constrained edge environments. However, because edge scenarios are complex and variable, completing DNN inference efficiently remains challenging. Previous work has made significant progress on aspects such as partition strategy, device diversity, and memory overhead, but few studies consider the impact of environmental dynamics, particularly environmental workloads. This oversight renders existing solutions impractical. In this study, we propose DIDS, a distributed inference framework with dynamic scheduling capability that accounts for environmental dynamics. We first conduct a formal analysis of distributed inference and design two basic scheduling strategies: Re-Partition and Complete-Push. We then introduce Dynamic Push, a runtime scheduler based on Complete-Push that mitigates the impact of environmental workloads. Evaluation results show that DIDS achieves up to a 3.7x speedup over static distributed inference and also delivers a significant improvement over the state-of-the-art method.