UFVL-Net: A Unified Framework for Visual Localization Across Multiple Indoor Scenes
Tao Xie, Zhiqiang Jiang, Shuozhan Li, Yukun Zhang, Kun Dai, Ke Wang, Ruifeng Li, Lijun Zhao
DOI: https://doi.org/10.1109/tim.2023.3315406
IF: 5.6
2023-10-07
IEEE Transactions on Instrumentation and Measurement
Abstract: Recently, scene coordinate regression (SCoRe) approaches for visual localization have been extensively investigated. However, current SCoRe methods are scene-specific and must be retrained to generalize to new scenarios, so the total model capacity grows steadily as the number of scenes increases. To this end, we develop UFVL-Net, a unified framework that integrates the localization tasks of multiple indoor scenarios into a single manageable network and optimizes these tasks jointly across diverse scene domains, where the localization of each scene domain is treated as a separate task. UFVL-Net is storage-efficient because multiple models with shared parameters can be consolidated into one. Specifically, we introduce two parameter sharing policies, a channel-wise sharing policy (CSP) and a kernel-wise sharing policy, which offer fine-grained parameter sharing within each layer of the backbone for efficient storage while providing task-specific parameters to tackle the inherent difficulty of multidomain learning for visual localization, namely gradient conflict caused by skewed competition among tasks for the shared parameters. The key insight is that task-sharing parameters can learn a generic feature representation across scenes, while task-specific parameters can learn task-related features that alleviate gradient conflict. Moreover, we develop a sign-based gradient normalization (SIGGrad) technique, applied to the task-sharing parameters, that promotes the training of UFVL-Net by further mitigating gradient conflict, thus emphasizing the utilization of task-sharing parameters and ensuring that each task is thoroughly optimized. We conduct extensive experiments across numerous datasets and complex real-world scenarios, showing that the UFVL-Net family significantly outperforms state-of-the-art methods while requiring much less storage space. We also demonstrate that UFVL-Net can be generalized to new scenarios using only a few task-specific parameters, further highlighting its advantages. The code is publicly available.
engineering, electrical & electronic; instruments & instrumentation
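
The abstract describes channel-wise parameter sharing only at a high level. The following is a minimal sketch of the general idea, not the authors' implementation: each task composes its convolution kernel from a shared tensor and a small set of task-specific output channels. The function names, the fixed boolean masks, the tensor shapes, and the storage accounting are all assumptions made for illustration; the abstract does not say how the sharing policy is chosen or trained.

```python
import numpy as np


def build_task_weights(shared, specific, mask):
    """Compose one task's kernel from shared and task-specific channels.

    mask has shape (out_channels,); True means "use the shared channel".
    This fixed mask is a hypothetical stand-in for a sharing policy.
    """
    # Broadcast the per-channel mask over (in_channels, k, k).
    return np.where(mask[:, None, None, None], shared, specific)


def stored_parameters(shared, masks):
    """Count stored weights: the shared kernel once, plus every
    task-specific output channel of every task."""
    per_channel = shared[0].size
    n_specific = sum(int((~m).sum()) for m in masks)
    return shared.size + n_specific * per_channel


rng = np.random.default_rng(0)
out_c, in_c, k = 8, 4, 3
shared = rng.standard_normal((out_c, in_c, k, k))

# Two hypothetical scenes (tasks) with different numbers of private channels.
masks = [
    np.array([True] * 6 + [False] * 2),
    np.array([True] * 5 + [False] * 3),
]
specifics = [rng.standard_normal(shared.shape) for _ in masks]
weights = [build_task_weights(shared, s, m) for s, m in zip(specifics, masks)]

total_shared_scheme = stored_parameters(shared, masks)
total_separate_models = len(masks) * shared.size
```

In this toy setting the shared scheme stores fewer weights than two independent kernels of the same shape, which illustrates the storage argument made in the abstract; the actual savings depend on how many channels the policy keeps task-specific.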