Detecting Omissions in Geographic Maps through Computer Vision

Phuc D. A. Nguyen, Anh Do, Minh Hoai

2024-07-15

Abstract: This paper explores the application of computer vision technologies to the analysis of maps, an area with substantial historical, cultural, and political significance. Our focus is on developing and evaluating a method for automatically identifying maps that depict specific regions and feature landmarks with designated names, a task that involves complex challenges due to the diverse styles and methods used in map creation. We address three main subtasks: differentiating maps from non-maps, verifying the accuracy of the region depicted, and confirming the presence or absence of particular landmark names through advanced text recognition techniques. Our approach uses a Convolutional Neural Network with transfer learning for map classification and region verification, and text recognition for confirming landmark names. We also introduce the VinMap dataset, containing annotated map images of Vietnam, to train and test our method. Experiments on this dataset demonstrate that our technique achieves an F1-score of 85.51% for identifying maps that omit specific territorial landmarks. This result suggests practical utility and indicates areas for future improvement.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper primarily explores how to use computer vision technology to detect the absence of specific regions and landmark names on maps, specifically focusing on the representation of Hoang Sa (Paracel Islands) and Truong Sa (Spratly Islands) on Vietnamese maps. The main objectives of the study are:

1. **Distinguishing between map and non-map images**: Using a Convolutional Neural Network (CNN) and transfer learning to identify whether the input image is a map.
2. **Verifying the accuracy of the area represented by the map**: Ensuring that the identified map indeed depicts Vietnam or its specific regions.
3. **Confirming the presence or absence of landmark names**: Using text recognition to determine whether the specified landmark names are present on the map.

To achieve these objectives, the authors propose a four-step processing workflow:

1. **Map Classification**: Using the EfficientNet-B4 model to classify images and determine whether they are maps of Vietnam.
2. **Text Detection**: Using a pre-trained DBNet model, fine-tuned for Vietnamese text and the specific landmark names, to detect text areas on the map.
3. **Text Recognition**: Using the VietOCR tool to recognize the content of the detected text areas.
4. **Vocabulary Matching**: Comparing the recognized text with a predefined vocabulary to confirm whether the specified landmark names appear on the map.

The research team also created a dataset named VinMap, which contains annotated images of Vietnamese maps, for training and testing their method. Experimental results show that the method achieved an F1 score of 85.51% in identifying Vietnamese maps lacking specific territorial landmarks (i.e., Hoang Sa and Truong Sa), indicating that the method has practical value while also pointing out directions for future improvement.
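The vocabulary-matching step of the workflow described above can be illustrated with a short sketch. This is not the authors' implementation: the vocabulary entries, the diacritic-stripping normalization, the `difflib` similarity measure, and the 0.85 threshold are all assumptions chosen for illustration. The function takes strings such as an OCR stage might return and reports which landmark names were not found.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical vocabulary: each landmark with spelling variants to accept.
VOCABULARY = {
    "Hoang Sa": ["Hoàng Sa", "Paracel Islands", "Paracel"],
    "Truong Sa": ["Trường Sa", "Spratly Islands", "Spratly"],
}


def normalize(text):
    """Lowercase, strip Vietnamese diacritics, and collapse whitespace."""
    # 'đ'/'Đ' have no Unicode decomposition, so map them explicitly.
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks (category Mn) left behind by the decomposition.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return " ".join(stripped.lower().split())


def matches(recognized, variant, threshold=0.85):
    """True if the variant occurs in the recognized text, exactly or nearly."""
    a, b = normalize(recognized), normalize(variant)
    if b in a:
        return True
    # Fuzzy fallback to tolerate small OCR errors (assumed threshold).
    return SequenceMatcher(None, a, b).ratio() >= threshold


def find_missing_landmarks(recognized_texts, vocabulary=VOCABULARY,
                           threshold=0.85):
    """Return the landmark names that no recognized string matches."""
    missing = []
    for landmark, variants in vocabulary.items():
        candidates = [landmark] + variants
        found = any(
            matches(text, candidate, threshold)
            for text in recognized_texts
            for candidate in candidates
        )
        if not found:
            missing.append(landmark)
    return missing


if __name__ == "__main__":
    ocr_output = ["Hà Nội", "Quần đảo Hoàng Sa", "Biển Đông"]
    print(find_missing_landmarks(ocr_output))
```

Under these assumptions, a map would be flagged as omitting a landmark whenever the returned list is non-empty. Stripping diacritics trades some precision for robustness to OCR accent errors; a stricter matcher could compare accented forms directly.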
