Abstract: UAV-view Geo-Localization (UVGL) presents substantial challenges, particularly due to the disparity in visual appearance between drone-captured imagery and satellite perspectives. Existing methods usually assume a consistent scaling factor across different views. They therefore adopt predefined partition alignment and extract viewpoint-invariant representations by constructing a variety of part-level features. However, the scaling assumption does not always hold in real-world scenarios, where variations in the UAV flight state lead to scale mismatch between views and cause serious performance degradation. To overcome this issue, we propose a partition learning framework based on relative distance, which alleviates the dependence on scale consistency while mining fine-grained features. Specifically, we propose a distance-guided dynamic partition learning strategy (DGDPL), consisting of a square partition strategy and a distance-guided adjustment strategy. The former extracts fine-grained features and global features in a simple manner. The latter calculates the relative distance ratio between the drone view and the satellite view to adjust the partition size, thereby explicitly aligning the semantic information between partition pairs. Furthermore, we propose a saliency-guided refinement strategy to refine part-level features and further improve retrieval accuracy. Extensive experiments show that our approach achieves superior geo-localization accuracy across various scale-inconsistent scenarios and exhibits remarkable robustness against scale variations. The code will be released.

What problem does this paper attempt to address?

This paper attempts to solve the scale-variation problem in UAV-view Geo-Localization (UVGL). Existing methods usually assume a consistent scaling factor between viewpoints and adopt a predefined partition-alignment strategy to extract viewpoint-invariant feature representations. In real scenarios, however, changes in the UAV's flight state cause a scale mismatch between the two views, which significantly degrades performance. To address this, the authors propose Distance Guided Dynamic Partition Learning (DGDPL), a framework based on relative distance that reduces the dependence on scale consistency while mining fine-grained features.

The proposed method has three components:

1. **Square Partition Strategy (SPS)**: Extracts fine-grained features and global features in a simple manner.
2. **Distance-Guided Adjustment Strategy (DGAS)**: Calculates the relative distance ratio between the UAV view and the satellite view and uses it to adjust the partition size, thereby explicitly aligning the semantic information between partition pairs.
3. **Saliency-Guided Refinement Strategy (SGRS)**: Refines the part-level features to further improve retrieval accuracy.

With these strategies, the method achieves superior geo-localization accuracy in various scale-inconsistent scenarios and shows significant robustness to scale changes.

### Formula Summary

1. **Relative Distance Ratio Formula**: \[ \beta = \frac{H_D - H_S}{H_S} \] where \(H_D\) and \(H_S\) are the distances from the UAV viewpoint and the satellite viewpoint to the ground, respectively.

2. **Adjustment Factor Formula**: \[ \theta = \text{int}(\beta \cdot \lambda \cdot \alpha) \] where \(\text{int}(\cdot)\) is the rounding operation, \(\lambda\) is the up-sampling factor, and \(\alpha\) is a hyper-parameter that regulates the degree of partition change.

3. **Partition-Adjusted Shape Formula**: \[ f_{n_{SPS}}^D \in \mathbb{R}^{2048 \times \left(32\lambda\frac{n_{SPS}}{N_{SPS}} - 2\theta\right) \times \left(32\lambda\frac{n_{SPS}}{N_{SPS}} - 2\theta\right)} \] \[ f_{n_{SPS}}^S \in \mathbb{R}^{2048 \times \left(32\lambda\frac{n_{SPS}}{N_{SPS}}\right) \times \left(32\lambda\frac{n_{SPS}}{N_{SPS}}\right)} \]

4. **Saliency Heat Map Generation Formula**: \[ heat_{n_{SPS}}^i = \frac{N(M(f_{n_{SPS}}^i)) + CM_{n_{SPS}}^i}{2} \] where \(N(\cdot)\) is the normalization operation and \(M(\cdot)\) is the average pooling operation.

5. **Cross-Entropy Loss Function**: \[ L_{CE} = \sum_{i,j,n_{SPS}} -\log\left(\frac{\exp(z_{n_{SPS}}^{i,j}(y))}{\sum_{c=1}^{C}\exp(z_{n_{SPS}}^{i,j}(c))}\right) \]
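
To make the first three formulas concrete, the following is a minimal sketch, not the authors' implementation, of how the relative distance ratio, the adjustment factor, and the partition side lengths could be computed. Several points are assumptions: the function names are hypothetical, \(\text{int}(\cdot)\) is mapped to Python's `int()` (which truncates toward zero; the summary only says "rounding"), \(n_{SPS}\) is read as a 1-based partition index out of \(N_{SPS}\) partitions, and \(\lambda\) is treated as an integer. All numeric values in the usage lines are illustrative only.

```python
def relative_distance_ratio(h_d: float, h_s: float) -> float:
    """beta = (H_D - H_S) / H_S, following the summary's formula."""
    if h_s == 0:
        raise ValueError("H_S must be non-zero")
    return (h_d - h_s) / h_s


def adjustment_factor(beta: float, lam: int, alpha: float) -> int:
    """theta = int(beta * lambda * alpha).

    Assumption: int() truncation stands in for the unspecified rounding.
    """
    return int(beta * lam * alpha)


def partition_sides(n: int, n_total: int, lam: int, theta: int) -> tuple[int, int]:
    """Return (drone_side, satellite_side) for the n-th square partition.

    satellite_side = 32 * lambda * n / N
    drone_side     = satellite_side - 2 * theta
    """
    if not 1 <= n <= n_total:
        raise ValueError("n must satisfy 1 <= n <= n_total")
    numerator = 32 * lam * n
    if numerator % n_total != 0:
        raise ValueError("32 * lambda * n / N must be an integer side length")
    sat_side = numerator // n_total
    drone_side = sat_side - 2 * theta
    if drone_side <= 0:
        # How such a case is handled is not stated in the summary.
        raise ValueError("adjusted drone partition would be empty")
    return drone_side, sat_side


# Illustrative values only (not taken from the paper).
beta = relative_distance_ratio(h_d=300.0, h_s=200.0)  # 0.5
theta = adjustment_factor(beta, lam=2, alpha=4)        # 4
sides = [partition_sides(n, 4, lam=2, theta=theta) for n in range(1, 5)]
print(sides)  # [(8, 16), (24, 32), (40, 48), (56, 64)]
```

With a positive \(\theta\), each drone-view partition is 2θ smaller per side than its satellite-view counterpart. A negative \(\beta\) yields a negative \(\theta\) and enlarges the drone partitions instead; how that case is bounded by the feature-map size is not covered in the summary, so the sketch leaves it unhandled.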

Relative Distance Guided Dynamic Partition Learning for Scale-Invariant UAV-View Geo-Localization

SDPL: Shifting-Dense Partition Learning for UAV-View Geo-Localization

LODM: Large-scale Online Dense Mapping for UAV

View Distribution Alignment with Progressive Adversarial Learning for UAV Visual Geo-Localization

Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval

A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention

UAV Geo-Localization Dataset and Method Based on Cross-View Matching

Jointly Optimized Global-Local Visual Localization of UAVs

Adaptive Global Embedding Learning: A Two-Stage Framework for UAV-View Geo-Localization

Leveraging Map Retrieval and Alignment for Robust UAV Visual Geo-Localization

Orientation-Guided Contrastive Learning for UAV-View Geo-Localisation

Cross-view Geo-localization via Learning Disentangled Geometric Layout Correspondence

Game4Loc: A UAV Geo-Localization Benchmark from Game Data

GeoDTR+: Toward generic cross-view geolocalization via geometric disentanglement

UAV-Satellite View Synthesis for Cross-view Geo-Localization

A Cross-View Geo-Localization Algorithm Using UAV Image and Satellite Image

Mutual Relative Position Learning Transformer for Cross-View Geo-Localization

A coarse-to-fine visual geo-localization method for GNSS-denied UAV with oblique-view imagery

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator

Navigating the Metaverse: UAV-Based Cross-View Geo-Localization in Virtual Worlds