Semantic Segmentation of Land Cover in Urban Areas by Fusing Multisource Satellite Image Time Series

Jining Yan, Jingwei Liu, Dong Liang, Yi Wang, Jun Li, Lizhe Wang
DOI: https://doi.org/10.1109/tgrs.2023.3329709
IF: 8.2
2023-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Because land cover in urban areas is complex and highly heterogeneous, single-temporal pixel-wise and parcel-wise classification cannot recognize ground objects with high precision. Semantic segmentation of satellite image time series (SITS) can distinguish objects with similar spectral reflection and temporal evolution. However, optical SITS suffer from uneven time–frequency distribution and incompleteness, which makes it impossible to apply existing models directly to time-series semantic segmentation. This study proposes a semantic segmentation network that combines optical and radar SITS, named multisource temporal attention (TA) fusion-based temporal–spatial transformer (MTAF-TST), to achieve high-precision land cover classification in urban areas. First, MTAF-TST uses a transformer spatial semantic segmentation module to extract the spatial context of ground objects and perform pixel-level land cover classification, which alleviates the salt-and-pepper effect that traditional pixel-by-pixel classification tends to produce in complex scenes. Second, MTAF-TST uses a transformer temporal feature extraction module to mine long-range temporal dependencies and high-level semantic information, overcoming the inability of traditional convolutional and recurrent neural networks to capture long-range temporal dependencies in SITS. Finally, MTAF-TST uses a multisource TA fusion module to fuse the deep features of optical and radar SITS, overcoming the shortcoming of traditional direct feature concatenation, which cannot make full use of time-correlated features. The experimental results show that MTAF-TST exploits the complementarity of radar and optical SITS in terms of temporal completeness, color, and texture, and effectively improves the accuracy of SITS classification.
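The contrast the abstract draws between attention-based fusion and direct feature concatenation can be illustrated with a small sketch. This is not the paper's MTAF module, whose details are not given here. It is a generic scaled dot-product attention in plain Python, assuming one pixel's optical and radar features have already been extracted as sequences of equal-length vectors. The function name `temporal_attention_fuse` and the toy feature values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_fuse(optical, radar):
    """Fuse two per-pixel feature sequences with scaled dot-product attention.

    optical: list of T_o feature vectors (length d each), used as queries.
    radar:   list of T_r feature vectors (length d each), used as keys and values.
    Returns T_o fused vectors of length 2*d: each optical feature followed by
    an attention-weighted average of the radar features.
    (Hypothetical helper for illustration; not the MTAF-TST implementation.)
    """
    d = len(optical[0])
    fused = []
    for q in optical:
        # Similarity between this optical date and every radar date.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in radar]
        weights = softmax(scores)
        # Radar context: weighted average over all radar dates.
        context = [sum(w * v[j] for w, v in zip(weights, radar)) for j in range(d)]
        fused.append(list(q) + context)
    return fused


# Toy data (made up): 2 optical dates and 3 radar dates, feature dimension 2.
optical_seq = [[1.0, 0.0], [0.0, 1.0]]
radar_seq = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
fused_seq = temporal_attention_fuse(optical_seq, radar_seq)
```

Unlike plain concatenation, which requires the two sequences to share the same dates, the attention weights let each optical date draw on whichever radar dates are most similar to it, so the two sources may have different lengths and sampling.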