High-Resolution Canopy Height Mapping: Integrating NASA's Global Ecosystem Dynamics Investigation (GEDI) with Multi-Source Remote Sensing Data

Cesar Alvites, Hannah O'Sullivan, Saverio Francini, Marco Marchetti, Giovanni Santopuoli, Gherardo Chirici, Bruno Lasserre, Michela Marignani, Erika Bazzato
DOI: https://doi.org/10.3390/rs16071281
IF: 5
2024-04-05
Remote Sensing
Abstract: Accurate structural information about forests, including canopy heights and diameters, is crucial for quantifying tree volume, biomass, and carbon stocks, enabling effective forest ecosystem management, particularly in response to changing environmental conditions. Since late 2018, NASA's Global Ecosystem Dynamics Investigation (GEDI) mission has monitored global canopy structure using a satellite Light Detection and Ranging (LiDAR) instrument. While GEDI has collected billions of LiDAR shots across a near-global range (between 51.6°N and >51.6°S), their spatial distribution remains dispersed, posing challenges for achieving complete forest coverage. This study proposes and evaluates an approach that generates high-resolution canopy height maps by integrating GEDI data with Sentinel-1, Sentinel-2, and topographical ancillary data through three machine learning (ML) algorithms: random forests (RF), gradient tree boost (GB), and classification and regression trees (CART). To achieve this, the secondary aims included the following: (1) to assess the performance of three ML algorithms, RF, GB, and CART, in predicting canopy heights, (2) to evaluate the performance of our canopy height maps using reference canopy height from canopy height models (CHMs), and (3) to compare our canopy height maps with two other existing canopy height maps. RF and GB were the top-performing algorithms, achieving the best 13.32% and 16% root mean squared error for broadleaf and coniferous forests, respectively. Validation of the proposed approach revealed that the 100th and 98th percentile, followed by the average of the 75th, 90th, 95th, and 100th percentiles (AVG), were the most accurate GEDI metrics for predicting real canopy heights. Comparisons between predicted and reference CHMs demonstrated accurate predictions for coniferous stands (R-squared = 0.45, RMSE = 29.16%).
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary
What problem does this paper attempt to address?
The main problem this paper attempts to address is improving the resolution and coverage of forest canopy height maps. Specifically, the paper proposes a method to generate high-resolution canopy height maps by integrating NASA's Global Ecosystem Dynamics Investigation (GEDI) data with multi-source remote sensing data (Sentinel-1, Sentinel-2, and ancillary terrain data). The method aims to overcome the sparse, discontinuous spatial distribution and coarse resolution of GEDI data, thereby supporting more accurate monitoring and management of forest structure.

### Main Research Objectives:
1. **Evaluate the performance of three machine learning algorithms (Random Forest, RF; Gradient Boosting, GB; and Classification and Regression Trees, CART) in predicting canopy height**.
2. **Assess the performance of the generated canopy height maps using reference canopy height models (CHMs)**.
3. **Compare the generated canopy height maps with two existing canopy height maps**.

### Background and Motivation:
- **Importance of Forests**: Forests play a crucial role in regulating the carbon and water cycles, supporting biodiversity, and providing economic benefits.
- **Threats from Environmental Changes**: Environmental changes caused by human activities threaten the stability of forests and the ecosystem services they provide.
- **Application of Remote Sensing Technology**: Remote sensing technology, especially LiDAR, is widely used to assess forest structure, including canopy height, which is essential for estimating above-ground biomass and carbon storage.
- **Limitations of GEDI Data**: Although GEDI provides global canopy structure information, its coarse spatial resolution and discontinuous distribution limit its application in local forest studies.

### Solution:
- **Data Integration**: Combine GEDI data, Sentinel-1 and Sentinel-2 satellite imagery, and terrain data.
- **Machine Learning Algorithms**: Use Random Forest (RF), Gradient Boosting (GB), and Classification and Regression Trees (CART) algorithms for canopy height prediction.
- **Performance Evaluation**: Assess the accuracy of the generated maps by comparing them with reference airborne laser scanning (ALS) data and existing canopy height maps.

### Study Sites:
- Two Mediterranean forest test sites: Pennataro (oak-beech forest) and Lago di Occhito (Mediterranean pine forest), representing two different structural types of European forest ecosystems.

### Data Sources:
1. **ALS Data**: Used for canopy height validation.
2. **GEDI Relative Height Metrics**: Used for downscaling.
3. **Sentinel-1 Radar and Sentinel-2 Multispectral Satellite Imagery**: Used for canopy height prediction.
4. **Terrain Features**: Elevation, slope, aspect, and similar variables, used as additional predictors.
5. **Existing GEDI-derived Canopy Height Maps**: Used for further validation.

### Methods:
- **Feature Selection and Model Building**: Use 18 predictor variables derived from Sentinel-1 and Sentinel-2 imagery and terrain data.
- **Training and Testing Data Division**: Adopt a random sampling strategy, dividing the data into a training set (70%) and a testing set (30%).
- **Machine Learning Algorithms**: Use RF, GB, and CART algorithms to generate canopy height maps.
- **Performance Evaluation**: Evaluate the performance of the models and maps using the coefficient of determination (R-squared) and root mean square error (RMSE).

Through these methods, the paper aims to generate high-resolution canopy height maps to better support forest management and ecosystem conservation.
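
The workflow summarized above (a random 70/30 split, three tree-based regressors, and evaluation with R-squared and RMSE) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn as a stand-in toolkit, uses synthetic data in place of the real 18 Sentinel-1, Sentinel-2, and terrain predictors, and assumes that RMSE in percent means RMSE divided by the mean observed height. The names `relative_rmse` and `evaluate_models` are hypothetical helpers introduced only for this example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def relative_rmse(y_true, y_pred):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    return 100.0 * rmse / np.mean(y_true)


def evaluate_models(X, y, seed=0):
    """Fit RF, GB, and CART on a random 70/30 split and score each on the test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=seed
    )
    # Hyperparameters are library defaults, not values taken from the paper.
    models = {
        "RF": RandomForestRegressor(n_estimators=100, random_state=seed),
        "GB": GradientBoostingRegressor(random_state=seed),
        "CART": DecisionTreeRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "r2": r2_score(y_test, pred),
            "rmse_pct": relative_rmse(y_test, pred),
        }
    return scores


# Synthetic stand-in data: 500 samples with 18 predictors and a
# canopy-height-like response in metres (purely illustrative).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 18))
y = 20.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=1.0, size=500)

scores = evaluate_models(X, y)
for name, s in scores.items():
    print(f"{name}: R2 = {s['r2']:.2f}, RMSE = {s['rmse_pct']:.1f}%")
```

In a real application, each row of `X` would hold the predictor values sampled at a GEDI footprint and `y` the chosen GEDI relative height metric; the fitted model would then be applied to every pixel of the predictor stack to produce a continuous map.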