Edge Devices Inference Performance Comparison

R. Tobiasz, G. Wilczyński, P. Graszka, N. Czechowski, S. Łuczak

DOI: https://doi.org/10.5626/JCSE.2023.17.2.51

2023-06-21

Abstract: In this work, we investigate the inference time of the MobileNet family, the EfficientNet V1 and V2 families, VGG models, the ResNet family, and InceptionV3 on four edge platforms: NVIDIA Jetson Nano, Intel Neural Stick, Google Coral USB Dongle, and Google Coral PCIe. Our main contribution is a thorough analysis of these models in multiple settings, especially as a function of input size, the presence of the classification head, its size, and the scale of the model. Since these architectures are mainly used as feature extractors throughout the industry, we focus on analyzing them as such. We show that the Google platforms offer the fastest average inference time, especially for newer models such as the MobileNet and EfficientNet families, while the Intel Neural Stick is the most universal accelerator, able to run most architectures. These results should provide guidance for engineers in the early stages of AI edge system development. All results are accessible at https://bulletprove.com/research/edge_inference_results.csv
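To make the notion of "inference time as a function of input size" concrete, the following is a minimal timing sketch, not the authors' benchmarking code. The names `time_inference`, `sweep_input_sizes`, and `dummy_model` are hypothetical, and `dummy_model` is a stand-in workload rather than a real network; on an actual device it would be replaced by a call into the platform's runtime.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, Tuple


def time_inference(run: Callable[[int], object], input_size: int,
                   warmup: int = 3, repeats: int = 20) -> Tuple[float, float]:
    """Return (mean, stdev) wall-clock latency in milliseconds for one input size."""
    for _ in range(warmup):
        run(input_size)  # warm-up calls are discarded (lazy init, caches)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run(input_size)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


def sweep_input_sizes(run: Callable[[int], object], sizes: Iterable[int],
                      warmup: int = 3,
                      repeats: int = 20) -> Dict[int, Tuple[float, float]]:
    """Measure latency for each input size (side length of a square input)."""
    return {s: time_inference(run, s, warmup, repeats) for s in sizes}


def dummy_model(input_size: int) -> int:
    """Hypothetical stand-in whose cost grows with input_size squared."""
    return sum(i * i for i in range(input_size * input_size))


if __name__ == "__main__":
    for size, (mean_ms, std_ms) in sweep_input_sizes(dummy_model, [32, 64, 128]).items():
        print(f"{size}x{size}: {mean_ms:.3f} ms +/- {std_ms:.3f} ms")
```

Discarding warm-up runs and reporting a spread alongside the mean are common choices for latency measurements; whether the paper used the same protocol is not stated in the abstract.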

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to conduct a comparative study on the inference performance across various edge devices, specifically including the performance of models such as the MobileNet series, EfficientNet V1 and V2 series, VGG models, ResNet series, and InceptionV3 on four edge platforms (NVIDIA Jetson Nano, Intel Neural Stick, Google Coral USB Dongle, and Google Coral PCIe).

#### Main Contributions:

1. **Comprehensive Analysis**: Investigated the impact of different input sizes, the presence and size of classification heads, and other factors on model inference time.
2. **Feature Extractors**: Focused on the performance of models as feature extractors, as these pre-trained models are primarily used as feature extractors.
3. **Performance Comparison**: Results showed that Google platforms provided the fastest average inference time, especially on newer models like the MobileNet and EfficientNet families; while the Intel Neural Stick was the most versatile accelerator, capable of running most architectures.
4. **Engineering Guidance**: Provided guidance for engineers in developing AI edge systems, helping them choose the appropriate platform and model.

#### Research Background:

- **Network Load**: Sending high-resolution data from numerous IoT devices to computing units may lead to unpredictable time delays.
- **Computational Cost**: Using current state-of-the-art models to analyze high-resolution data may result in cost-inefficient systems.
- **Security**: Sending raw data to the cloud may be susceptible to hacking or reduce user trust.

By comparing the performance of different models on multiple edge devices, this study provides valuable data support for engineers, helping them make more efficient choices during the development process.
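The two headline criteria above, fastest average inference time and the share of architectures a platform can run, can be computed from a results table along the following lines. This is only a sketch: the column names (`platform`, `model`, `inference_ms`), the convention that an empty time means the model did not run, and every value in `SAMPLE_CSV` are made-up placeholders, not the layout or measurements of the published results file.

```python
import csv
import io
from collections import defaultdict

# Placeholder data with a hypothetical layout; not measurements from the paper.
SAMPLE_CSV = """platform,model,inference_ms
platform_a,model_x,10.0
platform_a,model_y,
platform_b,model_x,30.0
platform_b,model_y,50.0
"""


def summarize(csv_text: str) -> dict:
    """Per platform: mean latency over models that ran, and fraction that ran."""
    times = defaultdict(list)     # successful measurements per platform
    attempted = defaultdict(int)  # number of models tried per platform
    for row in csv.DictReader(io.StringIO(csv_text)):
        attempted[row["platform"]] += 1
        if row["inference_ms"]:   # empty field: model could not be run
            times[row["platform"]].append(float(row["inference_ms"]))
    return {
        p: {
            "mean_ms": sum(times[p]) / len(times[p]) if times[p] else None,
            "coverage": len(times[p]) / attempted[p],
        }
        for p in attempted
    }


if __name__ == "__main__":
    for platform, stats in summarize(SAMPLE_CSV).items():
        print(platform, stats)
```

Keeping the mean and the coverage separate matters because a platform that runs only the fastest models can show a low average while supporting few architectures, which is the trade-off the summary describes between the Google platforms and the Intel Neural Stick.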

