Abstract:DNN-based video analytics have empowered many new applications (e.g., automated retail). Meanwhile, the proliferation of fog devices provides developers with more design options to improve performance and save cost. To the best of our knowledge, this paper presents the first serverless system that takes full advantage of the client-fog-cloud synergy to better serve the DNN-based video analytics. Specifically, the system aims to achieve two goals: 1) Provide the optimal analytics results under the constraints of lower bandwidth usage and shorter round-trip time (RTT) by judiciously managing the computational and bandwidth resources deployed in the client, fog, and cloud environment. 2) Free developers from tedious administration and operation tasks, including DNN deployment, cloud and fog's resource management. To this end, we implement a holistic cloud-fog system referred to as VPaaS (Video-Platform-as-a-Service). VPaaS adopts serverless computing to enable developers to build a video analytics pipeline by simply programming a set of functions (e.g., model inference), which are then orchestrated to process videos through carefully designed modules. To save bandwidth and reduce RTT, VPaaS provides a new video streaming protocol that only sends low-quality video to the cloud. The state-of-the-art (SOTA) DNNs deployed at the cloud can identify regions of video frames that need further processing at the fog ends. At the fog ends, misidentified labels in these regions can be corrected using a light-weight DNN model. To address the data drift issues, we incorporate limited human feedback into the system to verify the results and adopt incremental learning to improve our system continuously. The evaluation demonstrates that VPaaS is superior to several SOTA systems: it maintains high accuracy while reducing bandwidth usage by up to 21%, RTT by up to 62.5%, and cloud monetary cost by up to 50%.

Λ DNN : Achieving Predictable Distributed DNN Training with Serverless Architectures

Design and implementation of efficient distributed deep learning model inference architecture on serverless computation

A Deep Reinforcement Learning based Algorithm for Time and Cost Optimized Scaling of Serverless Applications

MLLess: Achieving Cost Efficiency in Serverless Machine Learning Training

Functions as a service for distributed deep neural network inference over the cloud‐to‐things continuum

FedLess: Secure and Scalable Federated Learning Using Serverless Computing

ServerlessLLM: Low-Latency Serverless Inference for Large Language Models

Decentralized Proactive Model Offloading and Resource Allocation for Split and Federated Learning

Distributed Double Machine Learning with a Serverless Architecture

ElasticFlow: an Elastic Serverless Training Platform for Distributed Deep Learning.

EsDNN: Deep Neural Network based Multivariate Workload Prediction Approach in Cloud Environment

HarmonyBatch: Batching multi-SLO DNN Inference with Heterogeneous Serverless Functions

Prediction-driven resource provisioning for serverless container runtimes

Evaluating Serverless Machine Learning Performance on Google Cloud Run

A Serverless Cloud-Fog Platform for DNN-Based Video Analytics with Incremental Learning

MOPAR: A Model Partitioning Framework for Deep Learning Inference Services on Serverless Platforms

Rationing Bandwidth Resources for Mitigating Network Resource Contention in Distributed DNN Training Clusters.

Incorporating Serverless Computing into P2P Networks for ML Training: In-Database Tasks and Their Scalability Implications (Student Abstract)

Cloudless-Training: A Framework to Improve Efficiency of Geo-Distributed ML Training

FSD-Inference: Fully Serverless Distributed Inference with Scalable Cloud Communication

Privacy-Preserving Computation Offloading for Parallel Deep Neural Networks Training