Tinker: A Middleware for Deploying Multiple NN-Based Applications on a Single Machine

Chao Wang, Lihui Jin, Lei Gong, Chongchong Xu, Yahui Hu, Luchao Tan, Xuehai Zhou
DOI: https://doi.org/10.1109/tcad.2020.3019981
IF: 2.9
2021-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: Deep learning is now widely used in fields such as face recognition, object recognition, and image classification. Letting multiple application instances share GPU resources allows more applications to be deployed on limited GPU resources. However, sharing leads to resource contention, which causes application switching, timeouts, and related problems. Deploying these applications on a single machine with limited resources and scheduling their tasks properly while maximizing system performance is therefore a new challenge. In this article, we propose Tinker, a middleware that addresses the problem of deploying multiple CNN-based applications on a single machine. Tinker has two phases: 1) offline analysis and 2) runtime scheduling. Offline analysis generates the best application deployment configuration for the current system. Runtime scheduling schedules tasks so that they complete normally and uses resources efficiently to improve system performance. Our experiments show that Tinker boosts system performance and ensures that most tasks are completed efficiently.