Non-Linear Coded Computation for Distributed CNN Inference: A Learning-based Approach

Chunhui Xu, Xing Liu, Haoyu Wang, Mingjun Tang, Yiyu Liu
DOI: https://doi.org/10.1109/ICDCSW63686.2024.00027
2024-07-23
Abstract: Distributed inference enables multiple distributed computing devices to cooperatively perform statistical inference or data analysis tasks in order to reduce the inference delay. Because devices are heterogeneous in processing capacity, some may process slowly or even fail to complete their share of the inference task. This leads to a larger inference delay or, in some cases, failure of the inference task. In this work, we propose a novel learning-based coded inference framework for convolutional neural networks (CNNs) that introduces redundant computing devices. By using an encoder to determine the inference inputs of the redundant devices and a decoder to reconstruct the result of the convolutional segment of the CNN, we show that our framework can significantly enhance the resilience and efficiency of distributed inference systems. Through this encoding-decoding scheme, the framework dynamically compensates for devices that perform sub-optimally or fail during computation. This ensures that the overall system maintains high accuracy and low inference latency, even in heterogeneous device environments. Experiments on image classification across several realistic datasets demonstrate that our framework maintains high accuracy even when some distributed devices fail, validating the robustness of the proposed approach.
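The encoder/decoder idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's learned, non-linear scheme: it swaps in a plain sum-parity (linear) code, which recovers a missing result exactly only because a bare convolution (no bias, no activation) is linear. All names here (`conv2d_valid`, `encode`, `decode`, `random_matrix`) are hypothetical and chosen for illustration.

```python
import random

def conv2d_valid(x, k):
    """Plain 2-D cross-correlation with 'valid' padding (no bias, no activation)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

def encode(inputs):
    """Assumed linear encoder: the redundant device receives the element-wise sum of all inputs."""
    rows, cols = len(inputs[0]), len(inputs[0][0])
    return [[sum(x[i][j] for x in inputs) for j in range(cols)] for i in range(rows)]

def decode(results, parity_result, missing):
    """Recover the missing device's output from the parity output and the surviving outputs."""
    survivors = [r for idx, r in enumerate(results) if idx != missing]
    return [[parity_result[i][j] - sum(r[i][j] for r in survivors)
             for j in range(len(parity_result[0]))]
            for i in range(len(parity_result))]

def random_matrix(rng, rows, cols):
    """Helper that builds a rows x cols matrix of Gaussian samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
kernel = random_matrix(rng, 3, 3)
inputs = [random_matrix(rng, 8, 8) for _ in range(3)]  # one input per regular device

# Each regular device convolves its own input; the redundant device convolves the encoded input.
results = [conv2d_valid(x, kernel) for x in inputs]
parity_result = conv2d_valid(encode(inputs), kernel)

# Pretend device 1 failed or straggled, and reconstruct its output from the others.
missing = 1
recovered = decode(results, parity_result, missing)
max_error = max(abs(recovered[i][j] - results[missing][i][j])
                for i in range(len(recovered)) for j in range(len(recovered[0])))
```

Once a non-linear layer such as ReLU follows the convolution, the sum of the outputs no longer equals the output of the summed input, so this exact recovery breaks down. That gap is one reason a learned encoder and decoder, as proposed in the abstract, may be preferable to a fixed linear code.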
Computer Science, Engineering