Identifying architectural design decisions for achieving green ML serving

Francisco Durán, Silverio Martínez-Fernández, Matias Martinez, Patricia Lago

DOI: https://doi.org/10.1145/3644815.3644962

2024-02-13

Abstract: The growing use of large machine learning models highlights concerns about their increasing computational demands. While the energy consumption of their training phase has received attention, fewer works have considered the inference phase. For ML inference, the binding of ML models to the ML system for user access, known as ML serving, is a critical yet understudied step for achieving efficiency in ML applications. We examine the literature in ML architectural design decisions and Green AI, with a special focus on ML serving. The aim is to analyze ML serving architectural design decisions for the purpose of understanding and identifying them with respect to quality characteristics from the point of view of researchers and practitioners in the context of ML serving literature. Our results (i) identify ML serving architectural design decisions along with their corresponding components and associated technological stack, and (ii) provide an overview of the quality characteristics studied in the literature, including energy efficiency. This preliminary study is the first step in our goal to achieve green ML serving. Our analysis may aid ML researchers and practitioners in making green-aware architecture design decisions when serving their models.

Machine Learning, Software Engineering

What problem does this paper attempt to address?

This paper asks which architectural design decisions in machine learning (ML) serving matter for reducing environmental impact. It focuses on the inference phase, after a model has been trained, when the model is bound to the ML system for user access, a step known as ML serving. While energy consumption during training has received attention, fewer works have studied efficiency during inference. The paper reviews the literature on ML architectural design decisions and Green AI, with a special focus on ML serving. Its objective is to identify ML serving design decisions and understand them with respect to quality characteristics, from the point of view of researchers and practitioners. The findings list the major ML serving design decisions (serving without a runtime engine, serving with a runtime engine, software specific to deep learning (DL), and end-to-end ML cloud services) and give an overview of the quality characteristics studied for them, including energy efficiency. Performance efficiency is the most commonly considered quality characteristic, while research on energy efficiency is relatively scarce. The study also identifies some cross-domain decisions, such as containerization, model formats, request processing, and communication protocols, which may be related to the serving infrastructure and affect serving efficiency and sustainability. The paper emphasizes the need for more in-depth research on how different design decisions affect the energy consumption of ML serving. Such research would help researchers and practitioners make green-aware architecture decisions when deploying models.

A Synthesis of Green Architectural Tactics for ML-Enabled Systems

Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference

Uncovering Energy-Efficient Practices in Deep Learning Training: Preliminary Steps Towards Green AI

Which design decisions in AI-enabled mobile applications contribute to greener AI?

Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference

A Review for Green Energy Machine Learning and AI Services

How to use model architecture and training environment to estimate the energy consumption of DL training

Impact of ML Optimization Tactics on Greener Pre-Trained ML Models

Power Hungry Processing: Watts Driving the Cost of AI Deployment?

Integrating machine and deep learning technologies in green buildings for enhanced energy efficiency and environmental sustainability

Towards green AI-based software systems: an architecture-centric approach (GAISSA)

ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design

GAISSALabel: A tool for energy labeling of ML models

Carbon Emissions and Large Neural Network Training

Advancing Energy Performance Efficiency in Residential Buildings for Sustainable Design: Integrating Machine Learning and Optimized Explainable AI (AIX)

Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service

A Survey of Machine Learning for Computer Architecture and Systems

MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from μWatts to MWatts for Sustainable AI

Towards Green Automated Machine Learning: Status Quo and Future Directions

A systematic review of Green AI