Abstract: Driven by increasing amounts of data and compute resources, deep learning has achieved many successes in various domains. As the application of deep learning on mobile and embedded devices attracts more and more attention, benchmarking and ranking the AI abilities of these devices has become an urgent problem. Considering model diversity and framework diversity, we propose a benchmark suite, AIoTBench, which focuses on evaluating the inference abilities of mobile and embedded devices. AIoTBench covers three typical heavy-weight networks (ResNet50, InceptionV3, DenseNet121) and three light-weight networks (SqueezeNet, MobileNetV2, MnasNet). Each network is implemented in three frameworks designed for mobile and embedded devices: TensorFlow Lite, Caffe2, and PyTorch Mobile. To compare and rank the AI capabilities of the devices, we propose two unified metrics as the AI scores: Valid Images Per Second (VIPS) and Valid FLOPs Per Second (VOPS). Currently, we have compared and ranked 5 mobile devices using our benchmark. This list will be extended and updated.

What problem does this paper attempt to address?

Deep learning is being applied successfully in many fields, and its use on mobile and embedded devices is drawing growing attention. The paper asks how to benchmark and rank the AI inference capabilities of these devices effectively. Specifically, it addresses the following problems:

1. **Model diversity**: Different neural network architectures make different trade-offs between accuracy and computational complexity, and no single architecture suits every design and application scenario. A benchmark suite therefore needs to cover a variety of typical heavy-weight and light-weight architectures to evaluate devices comprehensively.
2. **Framework diversity**: Many popular deep learning frameworks (such as TensorFlow Lite, Caffe2, and PyTorch Mobile) provide different implementations and levels of support on mobile and embedded devices. A benchmarking tool therefore needs to compare performance across frameworks.
3. **Hardware acceleration support**: Modern mobile and embedded devices are usually equipped with hardware accelerators such as GPUs or NPUs to support AI applications. Devices differ in how well they support these accelerators, so the benchmarking method needs to reflect that difference.

To address these problems, the authors propose a benchmark suite named AIoTBench, which focuses on evaluating the inference capabilities of mobile and embedded devices. AIoTBench covers three typical heavy-weight networks (ResNet50, InceptionV3, DenseNet121) and three light-weight networks (SqueezeNet, MobileNetV2, MnasNet), and each network is implemented in three frameworks designed specifically for mobile and embedded devices.

In addition, to compare and rank the AI capabilities of different devices, the authors propose two unified metrics as AI scores: Valid Images Per Second (VIPS) and Valid FLOPs Per Second (VOPS). These two metrics reflect the trade-off between quality and performance in AI systems. In summary, the paper aims to help users better understand and evaluate the AI inference capabilities of mobile and embedded devices by providing a comprehensive and easy-to-use benchmarking tool.
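The summary above names VIPS and VOPS but does not give their formulas. The sketch below shows one plausible way such accuracy-weighted scores could be computed, assuming each model's throughput (images per second) is weighted by its accuracy and, for VOPS, additionally by its per-image FLOPs. The `ModelResult` type, the function names, and all numbers are hypothetical illustrations, not measurements or definitions taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ModelResult:
    """One (model, framework) measurement on a device. All fields are illustrative."""
    name: str
    accuracy: float           # validation accuracy in [0, 1]
    images_per_second: float  # measured inference throughput
    gflops_per_image: float   # model cost per image, in GFLOPs


def vips(results: Iterable[ModelResult]) -> float:
    """Assumed form of Valid Images Per Second: accuracy-weighted throughput, summed."""
    return sum(r.accuracy * r.images_per_second for r in results)


def vops(results: Iterable[ModelResult]) -> float:
    """Assumed form of Valid FLOPs Per Second: accuracy-weighted GFLOPs per second, summed."""
    return sum(r.accuracy * r.images_per_second * r.gflops_per_image for r in results)


# Made-up numbers for two placeholder models; not real benchmark data.
results = [
    ModelResult("heavy_model", accuracy=0.75, images_per_second=10.0, gflops_per_image=4.0),
    ModelResult("light_model", accuracy=0.5, images_per_second=40.0, gflops_per_image=0.5),
]

vips_score = vips(results)
vops_score = vops(results)
print(f"VIPS = {vips_score:.2f}, VOPS = {vops_score:.2f} GFLOPs/s")
```

Under these assumptions, a fast but inaccurate model contributes fewer "valid" images than its raw throughput suggests, which is one way a single score can reflect the quality and performance trade-off the summary describes.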

Comparison and Benchmarking of AI Models and Frameworks on Mobile Devices

Deep Learning on Mobile and Embedded Devices: State-of-the-art, Challenges, and Future Directions

Close the Gap Between Deep Learning and Mobile Intelligence by Incorporating Training in the Loop

Explore Training of Deep Convolutional Neural Networks on Battery-powered Mobile Devices: Design and Application

AIBench: an Industry Standard AI Benchmark Suite from Internet Services.

AIBench: An Industry Standard AI Benchmark Suite

AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking

AI Benchmark: Running Deep Neural Networks on Android Smartphones

Benchmarking of DL Libraries and Models on Mobile Devices

A Comprehensive Benchmark of Deep Learning Libraries on Mobile Devices

Benchmarking State-of-the-Art Deep Learning Software Tools

Rethinking Mobile AI Ecosystem in the LLM Era

Performance Analysis and Characterization of Training Deep Learning Models on Mobile Devices

MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases

AIBench Training: Balanced Industry-Standard AI Training Benchmarking

BENCHIP: Benchmarking Intelligence Processors

Deep Learning on Mobile and Embedded Devices

EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Benchmarking Object Detection Deep Learning Models in Embedded Devices