Quality Time: Carbon-Aware Quality Adaptation for Energy-Intensive Services
Philipp Wiesner, Dennis Grinwald, Philipp Weiß, Patrick Wilhelm, Ramin Khalili, Odej Kao
2024-11-28
Abstract: The energy demand of modern cloud services, particularly those related to generative AI, is increasing at an unprecedented pace. While hyperscalers are collectively failing to meet their self-imposed emission reduction targets, they face increasing pressure from environmental sustainability reporting across many jurisdictions. To date, carbon-aware computing strategies have primarily focused on batch process scheduling or geo-distributed load balancing. However, such approaches are not applicable to services that require constant availability at specific locations, due to latency, privacy, data, or infrastructure constraints.
In this paper, we explore how the carbon footprint of energy-intensive services can be reduced by adjusting the fraction of requests served by different service quality tiers. We show that by adapting the quality of responses to local carbon intensity, we can achieve additional carbon savings beyond resource and energy efficiency. Building on this, we introduce a multi-horizon optimization that reaches close-to-optimal carbon savings under realistic conditions and can dynamically adapt the service quality for best-effort users to stay within an annual carbon budget. Our approach can reduce the emissions of large-scale LLM services, which we estimate at multiple tens of thousands of tons of CO$_2$ annually, by up to 10%.
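The core idea of shifting high-quality responses toward low-carbon hours can be illustrated with a minimal sketch. This is not the multi-horizon optimization described above, only a greedy toy example under stated assumptions: the function names, the two-tier setup, the hourly carbon intensities, and the per-request energy values are all hypothetical.

```python
def plan_high_tier_fractions(carbon_intensity, requests, min_high_share):
    """Greedy toy planner (illustrative assumption, not the paper's method).

    Serves the required share of high-quality requests in the hours
    with the lowest carbon intensity and returns, per hour, the
    fraction of requests routed to the high-quality tier.
    """
    remaining = min_high_share * sum(requests)
    fractions = [0.0] * len(requests)
    # Visit hours from cleanest to dirtiest grid mix.
    for hour in sorted(range(len(requests)), key=lambda h: carbon_intensity[h]):
        if remaining <= 0:
            break
        if requests[hour] == 0:
            continue
        served = min(requests[hour], remaining)
        fractions[hour] = served / requests[hour]
        remaining -= served
    return fractions


def total_emissions(carbon_intensity, requests, fractions, energy_high, energy_low):
    """Emissions as carbon intensity times energy, summed over all hours."""
    return sum(
        ci * r * (f * energy_high + (1.0 - f) * energy_low)
        for ci, r, f in zip(carbon_intensity, requests, fractions)
    )


# Hypothetical inputs: four hours, equal load, half of all requests
# must be served by the high-quality tier over the whole period.
carbon_intensity = [100, 400, 200, 300]  # assumed gCO2 per kWh
requests = [10, 10, 10, 10]
energy_high, energy_low = 2.0, 1.0       # assumed energy units per request

adaptive = plan_high_tier_fractions(carbon_intensity, requests, 0.5)
uniform = [0.5] * len(requests)

adaptive_emissions = total_emissions(carbon_intensity, requests, adaptive, energy_high, energy_low)
uniform_emissions = total_emissions(carbon_intensity, requests, uniform, energy_high, energy_low)
```

In this toy setting, both plans serve the same overall share of high-quality responses, but the adaptive plan concentrates them in the two cleanest hours and therefore emits less than the uniform split. A realistic planner would additionally have to handle forecast uncertainty and a carbon budget, which this sketch ignores.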
Distributed, Parallel, and Cluster Computing; Systems and Control