Coupling video vision transformer (ViViT) into land change simulation: a comparison with three-dimensional convolutional neural network (3DCNN)

Haiyang Li, Liang Fan, Yifan Gao, Zhao Liu, Peichao Gao
DOI: https://doi.org/10.1080/14498596.2024.2312506
2024-02-28
Journal of Spatial Science
Abstract: To enhance land use/cover change (LUCC) simulation accuracy, we introduced ViViT-ANN-CA, which combines the video vision transformer's (ViViT) spatio-temporal feature extraction ability, the artificial neural network's (ANN) non-linear computing ability, and the cellular automaton's (CA) spatial computing ability. Compared to 3DCNN-ANN-CA, ViViT-ANN-CA showed higher accuracy in simulating water bodies and vegetation, with overall improvements in Hailing District and Wuxi City. ViViT demonstrates spatio-temporal feature extraction ability comparable to that of the three-dimensional convolutional neural network (3DCNN), making it promising for future dynamic LUCC simulations.
geography, physical; remote sensing