A Water Efficiency Dataset for African Data Centers

Noah Shumba, Opelo Tshekiso, Pengfei Li, Giulia Fanti, Shaolei Ren
2024-12-05
Abstract: AI computing and data centers consume a large amount of freshwater, both directly for cooling and indirectly for electricity generation. While most attention has been paid to developed countries such as the U.S., this paper presents the first-of-its-kind dataset that combines nation-level weather and electricity generation data to estimate water usage efficiency for data centers in 41 African countries across five different climate regions. We also use our dataset to evaluate and estimate the water consumption of inference on two large language models (i.e., Llama-3-70B and GPT-4) in 11 selected African countries. Our findings show that writing a 10-page report using Llama-3-70B could consume about 0.7 liters of water, while the water consumption by GPT-4 for the same task may go up to about 60 liters. For writing a medium-length email of 120-200 words, Llama-3-70B and GPT-4 could consume about 0.13 liters and 3 liters of water, respectively. Interestingly, given the same AI model, 8 out of the 11 selected African countries consume less water than the global average, mainly because of lower water intensities for electricity generation. However, water consumption can be substantially higher in some African countries with a steppe climate than the U.S. and global averages, prompting more attention when deploying AI computing in these countries. Our dataset is publicly available on Hugging Face (https://huggingface.co/datasets/masterlion/WaterEfficientDatasetForAfricanCountries/tree/main).
Machine Learning, Computers and Society
What problem does this paper attempt to address?
This paper addresses the water consumption of African data centers during their rapid expansion. Specifically, its goals are to:

1. **Fill the research gap**: Existing research has focused on data center water consumption in developed regions such as the United States and Europe, while little work has examined data center construction in Africa and its impact on water resources. Many African countries already face long-term drought and water shortages, so evaluating the water consumption of data centers in these countries is crucial.
2. **Create the first African data center water-use-efficiency dataset**: The paper constructs a dataset covering 41 African countries and five climate zones. It combines national-level weather and electricity generation data to estimate the direct (on-site) and indirect (off-site) water usage efficiency (WUE) of data centers. The dataset helps characterize the water consumption of African data centers and serves as a resource for future research.
3. **Evaluate the water consumption of AI model inference tasks**: Using this dataset, the paper estimates the water consumption of inference on two large language models (Llama-3-70B and GPT-4) in 11 selected African countries. Writing a 10-page report could consume about 0.7 liters of water with Llama-3-70B and up to about 60 liters with GPT-4. Writing a medium-length email (120-200 words) could consume about 0.13 liters with Llama-3-70B and about 3 liters with GPT-4.
4. **Reveal regional differences**: For the same AI model, 8 of the 11 selected African countries consume less water than the global average, mainly because their electricity generation has a lower water intensity. However, in some countries with a steppe climate, water consumption is substantially higher than both the U.S. and global averages, which calls for more caution when deploying AI computing services there.
5. **Provide policy recommendations**: By analyzing how water consumption differs across climate zones, the paper emphasizes adapting cooling systems to local climate conditions and reducing dependence on water-intensive energy sources. These recommendations support the sustainable development of the African data center industry and the rational use of limited water resources while meeting the needs of economic growth.

### Formula summary

To calculate the water consumption of data centers, the paper uses the following formulas:

- **On-site water consumption**: \[ W_{\text{on}}=\gamma_{\text{on}}\cdot E \] where \(W_{\text{on}}\) is the on-site water consumption, \(\gamma_{\text{on}}\) is the on-site WUE, and \(E\) is the server energy consumption.
- **Off-site water consumption**: \[ W_{\text{off}}=\gamma_{\text{off}}\cdot\rho\cdot E \] where \(W_{\text{off}}\) is the off-site water consumption, \(\gamma_{\text{off}}\) is the off-site WUE, \(\rho\) is the power usage effectiveness (PUE), and \(E\) is the server energy consumption.

Together, these formulas let researchers estimate the total water consumption of a data center and provide a quantitative basis for subsequent optimization.
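
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names, the units (liters per kWh for WUE, kWh for energy), and all numeric inputs are hypothetical placeholders, not values from the paper or its dataset. Summing the on-site and off-site terms to obtain a total is likewise an assumption made for the example.

```python
def on_site_water(gamma_on: float, energy_kwh: float) -> float:
    """On-site water consumption: W_on = gamma_on * E."""
    return gamma_on * energy_kwh


def off_site_water(gamma_off: float, pue: float, energy_kwh: float) -> float:
    """Off-site water consumption: W_off = gamma_off * rho * E, with rho the PUE."""
    return gamma_off * pue * energy_kwh


def total_water(gamma_on: float, gamma_off: float, pue: float, energy_kwh: float) -> float:
    """Assumed total: the sum of the on-site and off-site terms."""
    return on_site_water(gamma_on, energy_kwh) + off_site_water(gamma_off, pue, energy_kwh)


# Hypothetical inputs chosen only to exercise the formulas.
GAMMA_ON = 0.5     # on-site WUE, assumed to be in L/kWh
GAMMA_OFF = 2.0    # off-site WUE, assumed to be in L/kWh
PUE = 1.2          # power usage effectiveness (dimensionless)
ENERGY_KWH = 0.01  # server energy for one task, in kWh

w_on = on_site_water(GAMMA_ON, ENERGY_KWH)
w_off = off_site_water(GAMMA_OFF, PUE, ENERGY_KWH)
print(f"on-site:  {w_on:.4f} L")
print(f"off-site: {w_off:.4f} L")
print(f"total:    {w_on + w_off:.4f} L")
```

Note that the PUE multiplies only the off-site term, as in the formulas above: off-site water scales with the facility's total electricity draw, while the on-site term is expressed directly per unit of server energy.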