Automated tree-crown and height detection in a young forest plantation using mask region-based convolutional neural network (Mask R-CNN)

Zhenbang Hao, Lili Lin, Christopher J. Post, Elena A. Mikhailova, Minghui Li, Yan Chen, Kunyong Yu, Jian Liu
DOI: https://doi.org/10.1016/j.isprsjprs.2021.06.003
IF: 12.7
2021-01-01
ISPRS Journal of Photogrammetry and Remote Sensing
Abstract: Tree crown and height are primary tree measurements in forest inventory. Convolutional neural networks (CNNs) are a class of neural networks that can be used in forest inventory; however, no prior studies have developed a CNN model to detect tree crown and height simultaneously. This study is the first of its kind to explore training a mask region-based convolutional neural network (Mask R-CNN) to automatically and concurrently detect the discontinuous tree crowns and heights of Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) in a plantation. A DJI Phantom4-Multispectral Unmanned Aerial Vehicle (UAV) was used to obtain high-resolution images of the study site in Shunchang County, China. Tree crowns and heights of Chinese fir were manually delineated and derived from this UAV imagery. A portion of the ground-truthed tree height values was used as a test set, and the remaining measurements were used as the model training data. Six combinations of bands and derived products from the UAV imagery were used to detect tree crown and height: Multi band-DSM, RGB-DSM, NDVI-DSM, Multi band-CHM, RGB-CHM, and NDVI-CHM. The Mask R-CNN model with the NDVI-CHM combination achieved the best performance. For individual tree-crown detection of Chinese fir, the F1 score was 84.68% and the Intersection over Union (IoU) of tree crown delineation was 91.27%. Tree height estimates were highly correlated with heights derived from UAV imagery (R² = 0.97, RMSE = 0.11 m, rRMSE = 4.35%) and with field measurements (R² = 0.87, RMSE = 0.24 m, rRMSE = 9.67%). The results demonstrate that an input image with a CHM achieves higher accuracy in tree crown delineation and tree height assessment than an image with a DSM. The accuracy and efficiency of Mask R-CNN give it great potential to assist the application of remote sensing in forests.
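The abstract reports F1 score, IoU, RMSE, rRMSE, and R² without giving their formulas. The following is a minimal Python sketch of how such metrics are commonly computed, not the authors' implementation. All function names are hypothetical. The definitions are assumptions: rRMSE is taken as RMSE divided by the mean observed value, R² as the coefficient of determination, and crown masks as sets of pixel coordinates.

```python
import math


def f1_score(tp, fp, fn):
    """F1 from counts of true-positive, false-positive, and false-negative crowns."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mask_iou(pred, ref):
    """IoU of two crown masks, each given as a set of (row, col) pixels."""
    union = pred | ref
    return len(pred & ref) / len(union) if union else 0.0


def rmse(pred, obs):
    """Root-mean-square error between predicted and observed heights."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def rrmse(pred, obs):
    """Relative RMSE in percent (assumed: RMSE / mean observed height)."""
    return 100.0 * rmse(pred, obs) / (sum(obs) / len(obs))


def r_squared(pred, obs):
    """Coefficient of determination (assumed definition of R²)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot
```

With predicted and reference heights in metres, `rmse` returns metres and `rrmse` a percentage, matching the units quoted in the abstract.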