Abstract: We present a machine and deep learning method for offloading a trained deep learning model and transmitting packets efficiently on resource-constrained Internet of Things (IoT) edge devices and networks. Recently, IoT devices have become more diverse, and the volume of data they generate, such as images, voice, and time-series sensor signals, has been increasing. However, transmitting large amounts of data to a server or cloud is expensive owing to limited bandwidth and introduces latency for time-sensitive operations. Therefore, we propose a novel offloading and transmission policy that considers energy efficiency, execution time, and the number of generated packets for resource-constrained IoT edge devices running a deep learning model, together with a reinforcement learning method that finds an optimal contention window size for effective channel access under a contention-based medium access control (MAC) protocol. Reinforcement learning is used to improve the performance of the applied MAC protocol. Our proposed method determines the offloading and transmission strategy, that is, whether it is better to send fragmented packets of raw data directly or to send the extracted feature vector or the final output of the deep learning network, considering the computing performance and power consumption of the resource-constrained microprocessor, as well as the power consumption of the radio transceiver and the latency of transmitting all the generated packets. In the performance evaluation, we measured the performance parameters of ARM Cortex-M4 and Cortex-M7 processors for use in the network simulation. The evaluation results show that our proposed adaptive channel access and learning-based offloading and transmission methods outperform conventional role-based channel access schemes that transmit packets of raw data, and that they are effective for IoT edge devices and network protocols.
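
The offloading decision described in the abstract (send raw data, send the extracted feature vector, or send the final network output) can be illustrated with a minimal sketch. This is not the authors' algorithm: the weighted-sum cost, the names `Strategy`, `packets_needed`, `cost`, and `choose_strategy`, and every numeric value below are hypothetical assumptions chosen only to show how energy, execution time, and packet count might be traded off.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """One candidate offloading point (hypothetical model, not from the paper)."""
    name: str
    compute_time_s: float  # on-device execution time before transmission
    payload_bytes: int     # bytes left to transmit after that computation


def packets_needed(payload_bytes: int, max_payload_bytes: int) -> int:
    """Number of fragments needed to carry the payload (at least one packet)."""
    return max(1, math.ceil(payload_bytes / max_payload_bytes))


def cost(s: Strategy, *, cpu_power_w: float, tx_energy_per_packet_j: float,
         tx_time_per_packet_s: float, max_payload_bytes: int,
         energy_weight: float, latency_weight: float) -> float:
    """Weighted sum of energy (J) and latency (s); the weighting is an assumption."""
    n = packets_needed(s.payload_bytes, max_payload_bytes)
    energy = s.compute_time_s * cpu_power_w + n * tx_energy_per_packet_j
    latency = s.compute_time_s + n * tx_time_per_packet_s
    return energy_weight * energy + latency_weight * latency


def choose_strategy(strategies, **params) -> Strategy:
    """Pick the candidate with the lowest weighted cost."""
    return min(strategies, key=lambda s: cost(s, **params))


# Illustrative values only; none of these are measurements from the paper.
CANDIDATES = [
    Strategy("raw", compute_time_s=0.0, payload_bytes=3072),
    Strategy("features", compute_time_s=0.05, payload_bytes=256),
    Strategy("output", compute_time_s=0.2, payload_bytes=4),
]
FAST_LINK = dict(cpu_power_w=0.1, tx_energy_per_packet_j=0.001,
                 tx_time_per_packet_s=0.004, max_payload_bytes=100,
                 energy_weight=1.0, latency_weight=1.0)
SLOW_LINK = dict(FAST_LINK, tx_energy_per_packet_j=0.01, tx_time_per_packet_s=0.1)

if __name__ == "__main__":
    print(choose_strategy(CANDIDATES, **FAST_LINK).name)
    print(choose_strategy(CANDIDATES, **SLOW_LINK).name)
```

Under these assumed numbers, the cheap link favors sending the intermediate feature vector, while the slower and costlier link pushes the choice toward running the full network on the device and sending only its output. The per-packet cost terms are where a channel access scheme, such as the learned contention window mentioned above, would enter.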

Adaptive Offloading of Transformer Inference for Weak Edge Devices with Masked Autoencoders

Masked Autoencoders for Point Cloud Self-supervised Learning

Masked autoencoders are effective solution to transformer data-hungry

Optimizing the Deployment of Tiny Transformers on Low-Power MCUs

Exploring Approximation and Dataflow Co-Optimization for Scalable Transformer Inference Architecture on the Edge

AccEPT: an Acceleration Scheme for Speeding Up Edge Pipeline-parallel Training

ED-ViT: Splitting Vision Transformer for Distributed Inference on Edge Devices

Distributed Inference with Minimal Off-Chip Traffic for Transformers on Low-Power MCUs

LAMBO: Large AI Model Empowered Edge Intelligence

EASTER: Learning to Split Transformers at the Edge Robustly

Efficient Transformer Encoders for Mask2Former-style models

Efficient Deployment of Transformer Models in Analog In-Memory Computing Hardware

MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory

Energy-Efficient Edge Learning via Joint Data Deepening-and-Prefetching

Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

Offloading and Transmission Strategies for IoT Edge Devices and Networks

Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization

Bridging the Resource Gap: Deploying Advanced Imitation Learning Models onto Affordable Embedded Platforms

A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge