Abstract: The Internet-of-Things (IoT) has connected billions of devices to the Internet. These devices are already collecting zettabytes ($10^{21}$ bytes) of data. However, the current IoT framework suffers from limited sensor energy, communication bandwidth, and server storage. These limitations make it impractical to send all the sensor data to the server all the time. Compact smart sensors provide a way to address this challenge. As opposed to conventional sense-and-transmit sensors, emerging smart sensors can collect data, extract features, derive local inferences, and transmit only the inference outcomes, possibly along with some raw data associated with rare events, instead of all the raw data. This can dramatically reduce the amount of sensor data transmitted, and hence the communication energy and network traffic. However, edge or server inference models trained with conventional machine learning approaches do not account for the fact that the smart sensors in the system have already performed a local inference. These approaches need all the sensor data and hence only cater to the traditional sense-and-transmit paradigm, which undoes the energy benefits brought about by smart sensors. In this paper, we propose a hierarchical inference model for IoT applications based on hierarchical learning and local inferences. Our model takes advantage of inference already performed on smart sensors while still accommodating conventional sense-and-transmit sensors in the IoT system. It also generalizes sensor-level inference to inference at other edge nodes by exploiting the intrinsically sensor/edge-grouped structure of IoT data. We train classifiers hierarchically, aligned with the sensor-edge-server IoT paradigm. We verify our approach with seven IoT applications, demonstrating that the model is accurate, efficient, and generally applicable. We derive four edge-level inference models and four server-level inference models for these applications. For the four edge-level inference models, we reduce the number of bits transmitted from the sensor by $3.2\times$ to $42.7\times$ while also improving the classification accuracy by 0.3 to 6.7 percent. For the four server-level inference models, we reduce the number of edge-to-server bits transmitted by $17\times$ to $60\times$, with the change in classification accuracy ranging from $-0.4$ to $+0.1$ percent.
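To make the contrast between sense-and-transmit and smart sensors concrete, the following is a minimal Python sketch, not the model proposed in the paper. It assumes a hypothetical threshold classifier on each sensor, a hypothetical majority vote at the edge, 16-bit raw samples, and made-up sensor windows. The resulting reduction factor is a toy number and is unrelated to the results reported above.

```python
from math import ceil, log2

def sensor_inference(window, threshold):
    """Hypothetical local classifier: label 1 if the window mean exceeds threshold."""
    return int(sum(window) / len(window) > threshold)

def edge_inference(labels):
    """Hypothetical edge-level rule: majority vote over the sensor labels."""
    return int(sum(labels) * 2 > len(labels))

def raw_bits(window, bits_per_sample=16):
    """Bits needed to send every raw sample (assumed 16 bits per sample)."""
    return len(window) * bits_per_sample

def label_bits(num_classes=2):
    """Bits needed to send one class label instead of the raw window."""
    return max(1, ceil(log2(num_classes)))

# Made-up windows from three sensors attached to one edge node.
windows = [
    [0.9, 1.1, 1.0, 1.2],
    [0.1, 0.2, 0.1, 0.0],
    [0.8, 0.7, 0.9, 1.0],
]

# Smart sensors: infer locally and transmit only the label.
labels = [sensor_inference(w, threshold=0.5) for w in windows]
decision = edge_inference(labels)

# Compare traffic against sending all raw samples to the edge.
sent_raw = sum(raw_bits(w) for w in windows)
sent_labels = len(windows) * label_bits()
reduction = sent_raw / sent_labels

print(labels, decision, sent_raw, sent_labels, reduction)
```

In this sketch the edge node never sees raw samples. It consumes only the sensor-level labels, which is the property that conventional models trained on all of the raw data cannot exploit.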

A Hierarchical Inference Model for Internet-of-Things
