Design of a Novel Neural Network Compression Method for Tiny Machine Learning

Yingling Li, Zhipeng Li, Tianxing Zhang, Peng Zhou, Siyin Feng, Kunqin Yin
DOI: https://doi.org/10.1145/3501409.3501526
2021-10-22
Abstract: In traditional IoT systems, data is sent from local devices to the cloud for processing, which brings disadvantages such as weak privacy, high latency, and low energy efficiency. These drawbacks can be effectively remedied by deploying the model on devices at the "edge" of the cloud and processing the data there. For data to be processed directly on such edge devices, the original machine learning algorithms need to be adapted, and one of the important steps is neural network compression. This paper proposes a neural network compression method for Tiny Machine Learning (TinyML). A conventional neural network is first trained and then compressed by applying group convolution, pruning, and asymmetric ternary quantization. The compressed model is then converted with TFLite for deployment on embedded devices. With this compression method, the model size can be greatly reduced while accuracy is preserved. The conventional machine learning model is thereby upgraded to TinyML, and finally a TinyML-based fall monitoring system for the elderly is built.
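To make the quantization step in the abstract more concrete, the following is a minimal sketch of asymmetric ternary quantization in Python with NumPy. The abstract does not give the paper's exact rule, so everything here is an assumption made for illustration: the function name `asymmetric_ternary_quantize`, the threshold factor `t`, and the choice of thresholds as `t` times the mean magnitude of the positive and of the negative weights. The idea shown is only the general one: each weight is mapped to one of three values, with a separate threshold and a separate scale for each sign.

```python
import numpy as np


def asymmetric_ternary_quantize(w, t=0.7):
    """Map weights to {-a_neg, 0, +a_pos} (illustrative sketch only).

    Assumption: the positive and negative thresholds are t times the mean
    magnitude of the positive and negative weights, respectively. The
    paper's actual thresholding rule may differ.
    """
    w = np.asarray(w, dtype=np.float64)
    pos = w[w > 0]
    neg = w[w < 0]

    # Separate thresholds per sign; this is what makes the scheme asymmetric.
    d_pos = t * pos.mean() if pos.size else 0.0
    d_neg = t * (-neg).mean() if neg.size else 0.0

    pos_mask = w > d_pos
    neg_mask = w < -d_neg

    # Separate scale per sign: mean magnitude of the weights that survive.
    a_pos = w[pos_mask].mean() if pos_mask.any() else 0.0
    a_neg = (-w[neg_mask]).mean() if neg_mask.any() else 0.0

    # Ternary codes in {-1, 0, +1} and the dequantized weights.
    codes = pos_mask.astype(np.int8) - neg_mask.astype(np.int8)
    quantized = a_pos * pos_mask - a_neg * neg_mask
    return codes, a_pos, a_neg, quantized


# Hypothetical example weights, not taken from the paper.
weights = np.array([0.9, 0.1, -0.2, -1.5, 0.5, -0.05])
codes, a_pos, a_neg, quantized = asymmetric_ternary_quantize(weights)
```

Storing only `codes` (about two bits per weight) plus the two scales `a_pos` and `a_neg` per layer is what yields the size reduction; small-magnitude weights are zeroed, which also complements pruning.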