Abstract: Deep learning models have been used to support analytics beyond simple aggregation, where deeper and wider models have been shown to yield great results. These models consume large amounts of memory and computation. However, most large-scale industrial applications operate under a constrained computational budget. In practice, the peak workload of an inference service can be 10x higher than the average, and unpredictable extreme cases also occur. Substantial computational resources may be wasted during off-peak hours, and the system may crash when the workload exceeds its capacity. How to support deep learning services with dynamic workloads cost-efficiently remains a challenging problem. In this paper, we address the challenge with a general and novel training scheme called model slicing, which enables deep learning models to provide predictions within a prescribed computational resource budget dynamically. Model slicing can be viewed as an elastic computation solution that does not require more computational resources. Succinctly, each layer in the model is divided into groups of contiguous blocks of basic components (i.e., neurons in dense layers and channels in convolutional layers), and a partial order is imposed on these groups by enforcing that the groups participating in each forward pass always run from the first group to a dynamically determined rightmost group. Trained by dynamically indexing the rightmost group with a single parameter, the slice rate, the network is induced to build up group-wise, residual representations. During inference, a sub-model with fewer groups can then be readily deployed for efficiency; its computation is roughly quadratic in the width controlled by the slice rate. Extensive experiments show that models trained with model slicing can effectively support on-demand workloads with elastic inference cost.
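The slicing mechanism described in the abstract can be illustrated with a minimal sketch of a single dense layer. This is not the authors' implementation: the names `active_width` and `SlicedDense`, the group count of 4, and the random weight initialization are all assumptions made for illustration. The sketch only shows how a slice rate selects the leading groups of neurons and why the multiply-add count falls roughly quadratically when both the input and output widths are sliced.

```python
import random


def active_width(full_width, num_groups, slice_rate):
    """Units kept when only the leading groups selected by slice_rate are active."""
    group_size = full_width // num_groups
    # Always keep at least the first group and never more than all groups.
    active_groups = max(1, min(num_groups, round(slice_rate * num_groups)))
    return active_groups * group_size


class SlicedDense:
    """Hypothetical dense layer whose neurons are split into ordered groups."""

    def __init__(self, in_features, out_features, num_groups=4, seed=0):
        rng = random.Random(seed)
        self.out_features = out_features
        self.num_groups = num_groups
        # weight[j][i] connects input unit i to output unit j.
        self.weight = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_features)]
            for _ in range(out_features)
        ]

    def forward(self, x, slice_rate=1.0):
        """Return (outputs, multiply_adds) using only the leading groups.

        x may itself be a sliced activation from a previous layer, so only
        the first len(x) input columns of the weight matrix are used.
        """
        n_in = len(x)
        n_out = active_width(self.out_features, self.num_groups, slice_rate)
        outputs = [
            sum(w * v for w, v in zip(self.weight[j][:n_in], x))
            for j in range(n_out)
        ]
        return outputs, n_in * n_out


layer = SlicedDense(in_features=16, out_features=16, num_groups=4)
x_full = [1.0] * 16

# Full-width pass: all 4 groups on both sides of the layer.
y_full, cost_full = layer.forward(x_full, slice_rate=1.0)

# Half-width pass: the input is assumed to come from a layer sliced at the
# same rate, so only the first 2 groups are present on each side.
x_half = x_full[: active_width(16, 4, 0.5)]
y_half, cost_half = layer.forward(x_half, slice_rate=0.5)

print(cost_full, cost_half)
```

Because the sub-model reuses the leading rows and columns of the same weight matrix, no separate set of parameters is stored for each slice rate. In this sketch, halving the width of both the input and the output cuts the multiply-add count to a quarter.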

Model Slicing for Supporting Complex Analytics with Elastic Inference Cost and Resource Constraints