Abstract: Recent years have seen a surge in the popularity of commercial AI products based on generative, multi-purpose AI systems promising a unified approach to building machine learning (ML) models into technology. However, this ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon that they emit. In this work, we propose the first systematic comparison of the ongoing inference cost of various categories of ML systems, covering both task-specific models (i.e. finetuned models that carry out a single task) and "general-purpose" models (i.e. those trained for multiple tasks). We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models. We find that multi-purpose, generative architectures are orders of magnitude more expensive than task-specific systems for a variety of tasks, even when controlling for the number of model parameters. We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions. All the data from our study can be accessed via an interactive demo to carry out further exploration and analysis.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper aims to explore and compare the energy consumption and carbon emission costs of different types of machine learning (ML) systems during the inference phase. Specifically, the researchers focus on the energy consumption differences between task-specific models (i.e., models fine-tuned for a single task) and general-purpose models (i.e., models capable of handling multiple tasks). The main objectives of the paper include:

1. **Quantifying the energy consumption of different models**: By measuring the energy and carbon emissions required for 1,000 inferences, the study evaluates the energy consumption of different types of ML models during deployment.
2. **Comparing task-specific models and general-purpose models**: Analyzing the energy consumption differences between task-specific models and general-purpose models across different tasks, especially multimodal tasks (such as image generation and text generation).
3. **Revealing the relationship between energy consumption and model structure**: Investigating the impact of model size, task type, and modality on energy consumption to understand the environmental impact of different models in practical applications.

### Main Findings

1. **Task-Specific Models vs. General-Purpose Models**:
   - General-purpose models have significantly higher energy consumption during the inference phase compared to task-specific models, even when controlling for the number of model parameters.
   - For example, small models in image generation tasks (such as segmind/tiny-sd) produce much higher carbon emissions compared to models in text classification tasks (100 grams vs. 0.6 grams).
2. **Impact of Task Type**:
   - Classification tasks (such as image classification and text classification) have lower energy consumption, while generation tasks (such as text generation and image generation) have higher energy consumption.
   - Multimodal tasks (such as image generation) have the highest energy consumption, with average energy consumption being more than 60 times higher than text generation tasks.
3. **Impact of Model Size**:
   - There is a certain relationship between model size and energy consumption, but the task structure has a greater impact on energy consumption.
   - For example, even image generation models with fewer parameters have much higher energy consumption compared to text classification models with more parameters.

### Conclusion

The paper emphasizes the need to balance the environmental costs when deploying general-purpose models. Although general-purpose models may be more energy-efficient during the training phase, their energy consumption and carbon emissions during large-scale deployment can significantly increase. Therefore, the researchers suggest that more careful consideration of the environmental impact is needed when choosing models and propose further research to optimize model design and deployment strategies to reduce their negative impact on the environment.
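
The metric described above, energy and carbon per 1,000 inferences, comes down to simple arithmetic once power draw and run time are known. The following is a minimal sketch of that arithmetic, not the authors' measurement pipeline: the `InferenceRun` type, both function names, and every input value (average power, duration, grid carbon intensity) are hypothetical and chosen only for illustration. The only figures taken from the summary are the 100 g and 0.6 g emissions used in the final ratio.

```python
from dataclasses import dataclass


@dataclass
class InferenceRun:
    """One measured batch of inferences (all fields are illustrative assumptions)."""
    avg_power_watts: float    # mean power draw during the run
    duration_seconds: float   # wall-clock time of the run
    num_inferences: int       # number of inferences completed in the run


def energy_kwh_per_1000(run: InferenceRun) -> float:
    """Normalize a run's energy to kWh per 1,000 inferences."""
    # watts * seconds = joules; 1 kWh = 3,600,000 joules
    total_kwh = run.avg_power_watts * run.duration_seconds / 3_600_000
    return total_kwh / run.num_inferences * 1000


def emissions_g_per_1000(run: InferenceRun, grid_g_per_kwh: float) -> float:
    """Convert energy per 1,000 inferences to grams of CO2-equivalent."""
    return energy_kwh_per_1000(run) * grid_g_per_kwh


# Hypothetical run: 300 W average draw for 120 s over 1,000 inferences.
run = InferenceRun(avg_power_watts=300, duration_seconds=120, num_inferences=1000)
energy = energy_kwh_per_1000(run)
# Hypothetical grid carbon intensity of 400 g CO2eq per kWh.
emissions = emissions_g_per_1000(run, grid_g_per_kwh=400)

# Ratio of the two emission figures quoted in the summary
# (image generation at 100 g vs. text classification at 0.6 g).
ratio = 100 / 0.6

print(f"{energy:.3f} kWh and {emissions:.1f} g CO2eq per 1,000 inferences")
print(f"image generation emits about {ratio:.0f}x more than text classification")
```

Normalizing by the number of inferences lets runs of different lengths be compared on the same scale, which is what makes a cross-task comparison like the 100 g vs. 0.6 g example meaningful.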

Power Hungry Processing: Watts Driving the Cost of AI Deployment?

MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from μWatts to MWatts for Sustainable AI

Beyond Efficiency: Scaling AI Sustainably

Carbon Emissions and Large Neural Network Training

From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference

The Unseen AI Disruptions for Power Grids: LLM-Induced Transients

Double-Exponential Increases in Inference Energy: The Cost of the Race for Accuracy

The Energy Cost of Artificial Intelligence of Things Lifecycle

Trends in Energy Estimates for Computing in AI/Machine Learning Accelerators, Supercomputers, and Compute-Intensive Applications

Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference

How Green Can AI Be? A Study of Trends in Machine Learning Environmental Impacts

Empirical Measurements of AI Training Power Demand on a GPU-Accelerated Node

Revisit the Environmental Impact of Artificial Intelligence: the Overlooked Carbon Emission Source?

Watts and Bots: The Energy Implications of AI Adoption

Exploring the horizon of AI development: Navigating constraints of chips and power in the technological landscape

Toward Cross-Layer Energy Optimizations in AI Systems

Data-Centric Green AI: An Exploratory Empirical Study

Measuring the Carbon Intensity of AI in Cloud Instances

The Unpaid Toll: Quantifying the Public Health Impact of AI

AI Tax: The Hidden Cost of AI Data Center Applications

Method and evaluations of the effective gain of artificial intelligence models for reducing CO2 emissions