AGBD: A Global-scale Biomass Dataset

Ghjulia Sialelli, Torben Peters, Jan D. Wegner, Konrad Schindler
2024-06-07
Abstract: Accurate estimates of Above Ground Biomass (AGB) are essential in addressing two of humanity's biggest challenges, climate change and biodiversity loss. Existing datasets for AGB estimation from satellite imagery are limited. Either they focus on specific, local regions at high resolution, or they offer global coverage at low resolution. There is a need for a machine learning-ready, globally representative, high-resolution benchmark. Our findings indicate significant variability in biomass estimates across different vegetation types, emphasizing the necessity for a dataset that accurately captures global diversity. To address these gaps, we introduce a comprehensive new dataset that is globally distributed, covers a range of vegetation types, and spans several years. This dataset combines AGB reference data from the GEDI mission with data from Sentinel-2 and PALSAR-2 imagery. Additionally, it includes pre-processed high-level features such as a dense canopy height map, an elevation map, and a land-cover classification map. We also produce a dense, high-resolution (10 m) map of AGB predictions for the entire area covered by the dataset. Rigorously tested, our dataset is accompanied by several benchmark models and is publicly available. It can be easily accessed using a single line of code, offering a solid basis for efforts towards global AGB estimation. The GitHub repository at http://github.com/ghjuliasialelli/AGBD serves as a one-stop shop for all code and data.
Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing
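The abstract describes each sample as a combination of several co-registered sources (Sentinel-2, PALSAR-2, canopy height, elevation, land cover) paired with a GEDI reference value. As a rough illustration of how such a sample might be held in memory and stacked into one model input, here is a minimal sketch. The class name `AGBSample`, the helper `constant_grid`, the field names, the band counts, and the label unit are all assumptions made for this example; they are not taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import List

Grid = List[List[float]]  # one 2-D raster layer, indexed [row][col]


@dataclass
class AGBSample:
    """Hypothetical container for one sample: input layers plus a GEDI label."""
    sentinel2: List[Grid]   # optical bands (band count is an assumption)
    palsar2: List[Grid]     # SAR backscatter channels (count is an assumption)
    canopy_height: Grid     # dense canopy height map
    elevation: Grid         # elevation map
    land_cover: Grid        # land-cover class ids, stored as floats here
    agb_reference: float    # GEDI-derived AGB label (unit assumed to be Mg/ha)

    def stack_channels(self) -> List[Grid]:
        """Concatenate all input layers into one channels-first list."""
        layers = [*self.sentinel2, *self.palsar2,
                  self.canopy_height, self.elevation, self.land_cover]
        shape = (len(layers[0]), len(layers[0][0]))
        for layer in layers:
            # Compare row count and first-row width against the first layer.
            if (len(layer), len(layer[0])) != shape:
                raise ValueError("all layers must share the same patch size")
        return layers


def constant_grid(value: float, size: int) -> Grid:
    """Build a size x size grid filled with one value (toy data only)."""
    return [[value] * size for _ in range(size)]
```

With, say, four optical bands and two SAR channels, `stack_channels()` would return nine layers; a real loader would read the actual band layout from the dataset instead of hard-coding it.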
What problem does this paper attempt to address?
The paper aims to address the following issues:

1. **Lack of high-resolution global biomass datasets**: Existing Aboveground Biomass (AGB) estimation datasets are either limited to specific regions with high resolution or provide global coverage but with low resolution. Therefore, there is a need for a machine learning-ready, globally representative, and high-resolution dataset.
2. **Limitations of existing datasets**: Current datasets are either too localized to be generalized globally or have low resolution, failing to capture detailed biomass information. The dataset proposed in this paper covers various ecosystems and is presented in high resolution, aiding in training more performant and generalizable biomass estimation models.
3. **Improving the accuracy of regional studies**: By combining GEDI data with local reference data, the accuracy and performance of models in specific regions can be enhanced. Researchers can use this comprehensive dataset for initial training and then fine-tune with local data to improve the accuracy and performance of their analyses.
4. **Filling gaps in the literature**: There is currently a lack of a robust and diverse global dataset that supports high-resolution biomass mapping. The dataset provided in this paper has a nominal resolution of 10 meters, represents global land cover and vegetation distribution, and integrates multiple data sources.

In summary, the main objective of this paper is to provide a machine learning-ready, easily accessible, and globally representative dataset to facilitate high-resolution biomass estimation research.
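The workflow mentioned under regional studies, initial training on the global dataset followed by fine-tuning on local reference data, can be sketched with a toy example. The sketch below uses a one-feature linear model and plain gradient descent purely for illustration; it is not one of the paper's benchmark models, and all numbers are synthetic stand-ins rather than real AGBD or field measurements. The function names `mse`, `fit`, and `pretrain_then_finetune` are hypothetical.

```python
from typing import List, Tuple

Data = List[Tuple[float, float]]  # (feature, AGB label) pairs, toy values only


def mse(w: float, b: float, data: Data) -> float:
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fit(w: float, b: float, data: Data, lr: float, steps: int) -> Tuple[float, float]:
    """Run plain gradient descent on the MSE, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def pretrain_then_finetune() -> Tuple[float, float]:
    """Return the local-region error before and after fine-tuning."""
    # Synthetic stand-ins: the "local" region has a constant offset
    # that the "global" data does not contain.
    global_data: Data = [(float(x), 2.0 * x) for x in range(5)]
    local_data: Data = [(float(x), 2.0 * x + 1.0) for x in (1, 2, 3)]

    w, b = fit(0.0, 0.0, global_data, lr=0.05, steps=500)  # initial training
    error_before = mse(w, b, local_data)
    w, b = fit(w, b, local_data, lr=0.05, steps=50)        # fine-tuning
    error_after = mse(w, b, local_data)
    return error_before, error_after
```

Calling `pretrain_then_finetune()` returns the local error before and after the fine-tuning stage; in this toy setup the second value is smaller because the fine-tuning steps start from the globally trained parameters and only need to absorb the local offset.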