Cross-view scene image localization with Triplet Network integrating NetVLAD and Fully Connected Layers

XUE Zhaohui, ZHOU Yiyang, QIANG Yonggang, LIU Yifeng, LIN Hui
DOI: https://doi.org/10.11834/jrs.20210188
2021-01-01
Journal of Remote Sensing
Abstract: Geolocating scene images is important for outdoor positioning, target search, military reconnaissance, and related fields. To address cross-view scene image matching and localization between street-view and bird's-eye-view imagery, this paper proposes Tri-NetVLAD, a Triplet Network localization method that fuses NetVLAD (Net Vector of Locally Aggregated Descriptors), a trainable aggregated local descriptor, with fully connected layers. The Triplet Network consists of three convolutional neural networks (CNNs) and processes three images at once; it performs image retrieval and matching by increasing the distance between non-matching image pairs and decreasing the distance between matching pairs. Fusing NetVLAD with fully connected layers strengthens the correlation among features. The local convolutional features extracted by the CNN are passed through a NetVLAD layer and a fully connected layer to obtain a global descriptor and a feature vector, respectively, and the two are then fused. This improves the correlation among local features while preserving the differences between them, and raises the localization accuracy of the model. The DBL loss (Distance-based layer loss) is also improved: an added parameter λ strengthens the function's ability to discriminate hard samples, which improves the convergence speed and stability of the model as well as its localization accuracy. Experiments on the public Vo and Hays dataset (United States) show that Tri-NetVLAD achieves higher localization accuracy than existing methods such as MCVPlaces, Triplet eDBL-Net, and CVM-Net, with accuracy above 63% on the test set.
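As a rough illustration of the triplet objective described in the abstract (pull matching pairs together, push non-matching pairs apart), the following minimal Python sketch computes a soft-margin, distance-based triplet loss. It is not the paper's improved DBL loss: the abstract does not give the formula or say where λ enters, so the scale factor `lam`, the function names, and the toy 2-D vectors are all assumptions made for illustration.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def soft_margin_triplet_loss(anchor, positive, negative, lam=1.0):
    """Return log(1 + exp(lam * (d_pos - d_neg))).

    `lam` is a hypothetical scale factor, not the paper's definition of λ.
    """
    d_pos = squared_distance(anchor, positive)  # distance to the matching image
    d_neg = squared_distance(anchor, negative)  # distance to the non-matching image
    z = lam * (d_pos - d_neg)
    # Numerically stable softplus: log(1 + exp(z))
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Toy embeddings (assumed values): a street-view anchor, a matching
# bird's-eye view, and a non-matching bird's-eye view.
anchor = [0.0, 0.0]
matching = [0.1, 0.0]
non_matching = [1.0, 0.0]

easy = soft_margin_triplet_loss(anchor, matching, non_matching)
hard = soft_margin_triplet_loss(anchor, non_matching, matching)
print(easy, hard)
```

A triplet whose matching pair is closer than its non-matching pair yields a smaller loss than the reverse case, and in this sketch a larger `lam` widens that gap.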