Hierarchical and Distributed Machine Learning Inference Beyond the Edge

Anthony Thomas, Yunhui Guo, Yeseong Kim, Baris Aksanli, Arun Kumar, Tajana S. Rosing
DOI: https://doi.org/10.1109/icnsc.2019.8743164
2019-05-01
Abstract: Networked applications with heterogeneous sensors are a growing source of data. Such applications use machine learning (ML) to make real-time predictions. Currently, features from all sensors are collected in a centralized cloud-based tier to form the whole feature vector for ML prediction. This approach has a high communication cost, which wastes energy and often bottlenecks the network. In this work, we study an alternative approach that mitigates such issues by “pushing” ML inference computations out of the cloud and onto a hierarchy of IoT devices. Our approach raises a new technical challenge: “rewriting” an ML inference computation to factor it over a network of devices without significantly reducing prediction accuracy. We introduce novel exact factoring algorithms for some popular models that preserve accuracy. We also create novel approximate variants of other models that offer high accuracy. Measurements on a common IoT device show that energy use and latency can be reduced by up to 63% and 67%, respectively, without reducing accuracy relative to sending all data to the cloud.
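To make the idea of factoring an inference computation concrete, the following is a minimal sketch that assumes a plain linear model whose features are partitioned across devices. It is an illustration only, not the algorithms from the paper: the model choice, the device names, and the functions `local_partial_score`, `hierarchical_predict`, and `centralized_predict` are all hypothetical. Because a dot product is a sum over features, each device can compute the partial sum for its own sensors and forward a single number instead of its raw feature vector.

```python
from typing import Dict, List


def local_partial_score(weights: List[float], features: List[float]) -> float:
    """Runs on one device: dot product over that device's own features only."""
    return sum(w * x for w, x in zip(weights, features))


def hierarchical_predict(
    device_weights: Dict[str, List[float]],
    device_features: Dict[str, List[float]],
    bias: float,
) -> float:
    """Aggregator: adds one scalar per device, then the bias (assumed layout)."""
    partials = [
        local_partial_score(device_weights[name], device_features[name])
        for name in device_weights
    ]
    return sum(partials) + bias


def centralized_predict(
    device_weights: Dict[str, List[float]],
    device_features: Dict[str, List[float]],
    bias: float,
) -> float:
    """Baseline: ship every raw feature to one place and score the full vector."""
    all_weights = [w for name in device_weights for w in device_weights[name]]
    all_features = [x for name in device_weights for x in device_features[name]]
    return sum(w * x for w, x in zip(all_weights, all_features)) + bias


# Hypothetical two-device example.
weights = {"device_a": [1.0, 2.0], "device_b": [3.0]}
features = {"device_a": [0.5, 0.25], "device_b": [2.0]}
bias = -1.0

factored = hierarchical_predict(weights, features, bias)
baseline = centralized_predict(weights, features, bias)
```

In this toy case the factored score and the centralized score agree, which is the sense in which such a factoring is exact. Models whose prediction is not a simple sum over features would need a different decomposition or an approximation.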