Variable selection for estimating individual tree height using genetic algorithm and random forest

Evandro Nunes Miranda, Bruno Henrique Groenner Barbosa, Sergio Henrique Godinho Silva, Cassio Augusto Ussi Monti, David Yue Phin Tng, Lucas Rezende Gomide
DOI: https://doi.org/10.1016/j.foreco.2021.119828
IF: 3.7
2022-01-01
Forest Ecology and Management
Abstract: Tree height is an important trait in forest science and is highly associated with the quality of the site on which the trees are measured. However, other factors, such as competition and species interaction, may yield better estimates of individual tree height when taken into account, but these variables have so far been challenging to incorporate in model fitting. We propose a hybrid approach that uses a genetic algorithm for variable selection and a machine learning algorithm (random forest) for fitting models of individual tree height. We compare the proposed hybrid method with a mixed-effects model and a random forest model using a dataset of 5,608 trees and 189 environmental variables (forest inventory-based, soil, topographic, climate, spectral, and geographic) from sites in southeastern Brazil. The tree height models were evaluated using the coefficient of determination, absolute bias, and root mean square error (RMSE), based on performance on the validation dataset. The optimal set of variables selected by the proposed method includes the ratio of diameter at breast height to quadratic mean diameter, a distance-independent competition index, dominant height, and soil silt and boron content. Our findings show that the proposed hybrid method achieved an accuracy comparable to that of the other methodologies in estimating the total height of individual trees, and such a modelling approach could have broader applications in forestry and ecological science where a studied response trait has a large number of potential explanatory variables.
forestry
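
The hybrid idea in the abstract (a genetic algorithm searching over variable subsets, with a random forest scoring each subset) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synthetic data, the population size, the number of generations, the tournament selection, the uniform crossover, the mutation rate, and the use of cross-validated RMSE as the fitness are all assumptions chosen only to keep the example small. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def fitness(mask, X, y, seed=0):
    """Cross-validated RMSE of a random forest on the selected columns (lower is better)."""
    if not mask.any():
        return np.inf  # an empty subset cannot be fitted
    model = RandomForestRegressor(n_estimators=20, random_state=seed)
    scores = cross_val_score(model, X[:, mask], y, cv=3,
                             scoring="neg_root_mean_squared_error")
    return -scores.mean()


def ga_select(X, y, pop_size=10, generations=4, mutation_rate=0.1, seed=0):
    """Search boolean column masks with a simple genetic algorithm (illustrative settings)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Each individual is a boolean mask: True means the variable is kept.
    pop = rng.random((pop_size, n_features)) < 0.5
    best_mask, best_rmse = None, np.inf

    for _ in range(generations):
        rmse = np.array([fitness(ind, X, y, seed) for ind in pop])
        gen_best = int(rmse.argmin())
        if best_mask is None or rmse[gen_best] < best_rmse:
            best_mask, best_rmse = pop[gen_best].copy(), rmse[gen_best]

        def pick():
            # Binary tournament: the better of two random individuals becomes a parent.
            i, j = rng.integers(pop_size, size=2)
            return pop[i] if rmse[i] <= rmse[j] else pop[j]

        children = [best_mask.copy()]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            a, b = pick(), pick()
            take_a = rng.random(n_features) < 0.5      # uniform crossover
            child = np.where(take_a, a, b)
            flip = rng.random(n_features) < mutation_rate
            children.append(child ^ flip)              # bit-flip mutation
        pop = np.array(children)

    return best_mask, best_rmse


# Synthetic stand-in data (hypothetical): only columns 0 and 3 drive the response.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(150, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + data_rng.normal(scale=0.3, size=150)

mask, rmse = ga_select(X, y)
print("selected columns:", np.flatnonzero(mask), "CV RMSE:", round(float(rmse), 3))
```

With 189 candidate variables, as in the study, an exhaustive search over subsets is infeasible, which is why a stochastic search such as a genetic algorithm is attractive. In practice the fitness would usually be computed on data kept separate from the final validation set, so that the reported accuracy is not biased by the selection step.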