MFPANet: Multi-Scale Feature Perception and Aggregation Network for High-Resolution Snow Depth Estimation

Liling Zhao, Junyu Chen, Muhammad Shahzad, Min Xia, Haifeng Lin
DOI: https://doi.org/10.3390/rs16122087
IF: 5
2024-06-10
Remote Sensing
Abstract: Accurate snow depth estimation is of significant importance, particularly for preventing avalanche disasters and predicting flood seasons. The predominant approaches for such snow depth estimation, based on deep learning methods, typically rely on passive microwave remote sensing data. However, the low resolution of passive microwave remote sensing data often results in low-accuracy outcomes, posing considerable limitations in application. To further improve the accuracy of snow depth estimation, in this paper, we used active microwave remote sensing data. We fused multi-spectral optical satellite images, synthetic aperture radar (SAR) images and land cover distribution images to generate a snow remote sensing dataset (SRSD). It is a first-of-its-kind dataset that includes active microwave remote sensing images in high-latitude regions of Asia. Using these novel data, we proposed a multi-scale feature perception and aggregation neural network (MFPANet) that focuses on improving feature extraction from multi-source images. Our systematic analysis reveals that the proposed approach is not only robust but also achieves high accuracy in snow depth estimation compared to existing state-of-the-art methods, with an RMSE of 0.360 and an MAE of 0.128. Finally, we selected several representative areas in our study region and applied our method to map snow depth distribution, demonstrating its broad application prospects.
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary
What problem does this paper attempt to address?
The paper aims to improve the accuracy of high-resolution snow depth estimation. Existing deep-learning methods for snow depth estimation rely mainly on passive microwave remote sensing data, whose low spatial resolution leads to poor estimation accuracy and limits practical application. To address this, the paper uses active microwave remote sensing data and fuses multispectral optical satellite images, synthetic aperture radar (SAR) images, and land cover distribution images to construct a snow remote sensing dataset (SRSD). On this basis, it proposes a multi-scale feature perception and aggregation neural network (MFPANet) designed to improve feature extraction from multi-source images and thereby achieve more accurate snow depth estimation.

### Main contributions of the paper

1. **Constructing a multi-source dataset**: The paper constructs a snow remote sensing dataset for high-latitude areas of Asia. The dataset fuses multispectral optical satellite images, SAR images, and land cover distribution images, and uses surface snow depth measurements from meteorological stations as ground truth.
2. **Proposing a multi-scale neural network**: Unlike "point-to-point" prediction, which ignores spatial characteristics, the proposed model is a "region-to-point" deep model for snow depth estimation. The network comprises a multi-branch feature extraction unit (MBFE), a multi-scale feature atrous aggregation module (MSFAA), and a high- and low-level feature fusion module (HLF). These components give the model multi-scale feature perception, which is especially helpful for reducing spatial interference from non-snow areas and thus for achieving high-precision snow depth estimation.
3. **Drawing snow depth distribution maps**: By optimizing the model parameters, a snow depth distribution map with a resolution of 320 meters is generated for the study area. The same method is expected to produce high-resolution snow depth maps for any area of interest.

### Method overview

1. **Multi-branch feature extraction unit (MBFE)**: A dual-branch residual network structure extracts and fuses features at different levels. The branches apply down-sampling in different orders, and concatenating their outputs enhances feature complementarity.
2. **Multi-scale feature atrous aggregation module (MSFAA)**: The module contains multiple depth-wise separable convolutions with different dilation rates, together with 1x1 convolution and 2x2 average pooling operations. Features are aggregated following the pyramid pooling concept to reduce interference from non-snow factors.
3. **High- and low-level feature fusion module (HLF)**: Two branches take high-level semantic information and low-level feature information as input, respectively. The high- and low-level features are fused through weighting operations and up-sampling, which avoids the information distortion and loss of diversity caused by simple combination.

### Experimental setup

The experiments use the PyTorch framework on an Nvidia GeForce RTX 4070 Ti GPU. The optimizer is adaptive moment estimation (Adam), and the learning rate follows the "poly" strategy:

\[ \text{lr} = \text{base\_lr} \times \left(1 - \frac{\text{epoch}}{\text{num\_epoch}}\right)^{\text{power}} \]

where \(\text{lr}\) is the updated learning rate, \(\text{base\_lr}\) is the baseline learning rate, \(\text{epoch}\) is the current iteration count, \(\text{num\_epoch}\) is the maximum number of iterations, and \(\text{power}\) controls the shape of the decay curve.
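The poly schedule can be sketched in a few lines of plain Python. This is an illustration of the formula only, not the authors' training code: the function name `poly_lr` and the base learning rate of 1e-3 are assumptions, while the power of 0.9 and the 250 iterations follow the settings reported for this model.

```python
def poly_lr(base_lr: float, epoch: int, num_epoch: int, power: float = 0.9) -> float:
    """Return the learning rate for `epoch` under the poly decay strategy."""
    if not 0 <= epoch <= num_epoch:
        raise ValueError("epoch must lie in [0, num_epoch]")
    # lr = base_lr * (1 - epoch / num_epoch) ** power
    return base_lr * (1.0 - epoch / num_epoch) ** power


BASE_LR = 1e-3   # assumed value; the summary does not state the base learning rate
NUM_EPOCH = 250  # number of iterations reported for the model

# One learning rate per training iteration, starting at BASE_LR and decaying toward 0.
schedule = [poly_lr(BASE_LR, e, NUM_EPOCH) for e in range(NUM_EPOCH)]
```

With a power below 1, the curve decays slightly slower than linear for most of training and drops more steeply near the end.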
In this model, \(\text{power}\) is set to 0.9 and training runs for a fixed 250 iterations. No pre-trained parameters are used, and the batch size is 64. To prevent over-fitting, data augmentation, dropout, and normalization are applied. In addition, a ten-fold cross-validation experiment is performed, with each part of the data serving as the test set in turn.
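The ten-fold rotation can be illustrated with a small index-splitting sketch. The summary does not say how the folds were formed, so the shuffling, the seed, the helper name `k_fold_indices`, and the sample count of 1000 are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train, test) index lists so that each fold is the test set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: samples are shuffled before splitting
    # Deal indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset size; the actual number of SRSD samples is not given here.
splits = list(k_fold_indices(1000, k=10))
```

Each sample appears in exactly one test fold, so averaging the metric over the ten runs uses every sample for evaluation once.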