Detecting Omissions in Geographic Maps through Computer Vision

Phuc D. A. Nguyen, Anh Do, Minh Hoai
2024-07-15
Abstract: This paper explores the application of computer vision technologies to the analysis of maps, an area with substantial historical, cultural, and political significance. Our focus is on developing and evaluating a method for automatically identifying maps that depict specific regions and feature landmarks with designated names, a task that involves complex challenges due to the diverse styles and methods used in map creation. We address three main subtasks: differentiating maps from non-maps, verifying the accuracy of the region depicted, and confirming the presence or absence of particular landmark names through advanced text recognition techniques. Our approach utilizes a Convolutional Neural Network and transfer learning to differentiate maps from non-maps, verify the accuracy of depicted regions, and confirm landmark names through advanced text recognition. We also introduce the VinMap dataset, containing annotated map images of Vietnam, to train and test our method. Experiments on this dataset demonstrate that our technique achieves an F1-score of 85.51% for identifying maps excluding specific territorial landmarks. This result suggests practical utility and indicates areas for future improvement.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily explores how to use computer vision to detect the absence of specific regions and landmark names on maps, focusing on whether Hoang Sa (Paracel Islands) and Truong Sa (Spratly Islands) appear on maps of Vietnam. The main objectives of the study are:

1. **Distinguishing between map and non-map images**: Using a Convolutional Neural Network (CNN) and transfer learning to identify whether the input image is a map.
2. **Verifying the accuracy of the area represented by the map**: Ensuring that the identified map indeed depicts Vietnam or its specific regions.
3. **Confirming the presence or absence of landmark names**: Using advanced text recognition technology to determine whether the specified landmark names are present on the map.

To achieve these objectives, the authors propose a four-step processing workflow:

1. **Map Classification**: Using the EfficientNet-B4 model to classify images and determine whether they are maps of Vietnam.
2. **Text Detection**: Using the pre-trained DBNet model to detect text areas on the map, fine-tuned to detect Vietnamese text and the specific landmark names.
3. **Text Recognition**: Using the VietOCR tool to recognize the content of the detected text areas.
4. **Vocabulary Matching**: Comparing the recognized text with a predefined vocabulary to confirm whether the specified landmark names are included on the map.

The research team also created a dataset named VinMap, which contains annotated images of Vietnamese maps, for training and testing their method. Experimental results show that the method achieved an F1 score of 85.51% in identifying Vietnamese maps lacking specific territorial landmarks (i.e., Hoang Sa and Truong Sa), indicating that the method has practical value while also pointing out directions for future improvement.
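
The vocabulary-matching step (step 4 of the workflow) can be illustrated with a minimal sketch. This summary does not describe the paper's actual matching rule, so everything below is an assumption made for illustration: the vocabulary entries and their English aliases, the diacritic-stripping normalization, the fuzzy-match threshold of 0.8, and the function names `normalize`, `matches`, and `missing_landmarks`. The input is a plain list of strings standing in for the output of the text recognition step.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical vocabulary: each landmark maps to normalized aliases.
# The paper's real vocabulary is not given in this summary.
LANDMARK_VOCAB = {
    "hoang sa": ["hoang sa", "paracel"],
    "truong sa": ["truong sa", "spratly"],
}


def normalize(text: str) -> str:
    """Lowercase, strip Vietnamese diacritics, and collapse whitespace."""
    # 'đ' has no Unicode decomposition, so it is mapped by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks (tone marks, horn, breve, circumflex).
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return " ".join(stripped.lower().split())


def matches(recognized: str, alias: str, threshold: float = 0.8) -> bool:
    """True if the alias occurs in the recognized text, exactly or approximately."""
    text = normalize(recognized)
    if alias in text:
        return True
    # Fuzzy fallback to tolerate small OCR errors (assumed threshold).
    return SequenceMatcher(None, text, alias).ratio() >= threshold


def missing_landmarks(recognized_texts, vocab=LANDMARK_VOCAB, threshold=0.8):
    """Return the landmarks for which no alias matched any recognized string."""
    missing = []
    for landmark, aliases in vocab.items():
        found = any(
            matches(text, alias, threshold)
            for text in recognized_texts
            for alias in aliases
        )
        if not found:
            missing.append(landmark)
    return missing


if __name__ == "__main__":
    ocr_output = ["Hà Nội", "Quần đảo Hoàng Sa", "Biển Đông"]
    print(missing_landmarks(ocr_output))
```

With the sample input above, only Hoang Sa is found, so the function reports `truong sa` as missing. Under this sketch, a map would be flagged as omitting a landmark whenever the returned list is non-empty; a real system would likely need a tuned threshold and a larger alias list to balance false alarms against missed omissions.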