Global License Plate Dataset

Siddharth Agrawal
2024-03-22
Abstract: In the pursuit of advancing the state-of-the-art (SOTA) in road safety, traffic monitoring, surveillance, and logistics automation, we introduce the Global License Plate Dataset (GLPD). The dataset consists of over 5 million images, including diverse samples captured from 74 countries with meticulous annotations, including license plate characters, license plate segmentation masks, license plate corner vertices, as well as vehicle make, colour, and model. We also include annotated data on more classes, such as pedestrians, vehicles, roads, etc. We include a statistical analysis of the dataset, and provide baseline efficient and accurate models. The GLPD aims to be the primary benchmark dataset for model development and finetuning for license plate recognition.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper introduces a large-scale dataset, the Global License Plate Dataset (GLPD), intended to advance automatic license plate recognition for road safety, traffic monitoring, surveillance, and logistics automation. GLPD contains over 5 million images from 74 countries, annotated with license plate characters, license plate segmentation masks, and license plate corner vertices, as well as vehicle make, color, and model. The dataset also includes annotations for additional classes such as pedestrians, vehicles, and roads.
The authors identify several limitations of existing datasets: small scale, poor regional representation, and limited diversity in fonts and formats, all of which restrict model performance in real-world environments. GLPD addresses these limitations by covering varied environmental conditions, different languages and scripts, non-standard plate mounting, and extreme scale variations, with the aim of becoming a primary benchmark for developing and fine-tuning license plate recognition models.
The Methods section describes the data collection process, in which images were sourced primarily from Platesmania.com, along with the annotation and verification procedures. The paper reports two evaluation metrics: mean Average Precision (mAP) for detection and end-to-end recognition accuracy. The models trained for the paper use YOLOv5 for the detection task and PARSeq and CRNN for the recognition task. The Ethics Considerations and Limitations section discusses privacy protection measures, such as face blurring and watermarking, as well as efforts to balance the number of samples per category to reduce potential biases.
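To make the two evaluation metrics mentioned above more concrete, here is a minimal Python sketch of an end-to-end recognition accuracy check. It is not the paper's evaluation code. The function names `iou` and `end_to_end_accuracy`, the 0.5 IoU threshold, the one-plate-per-image pairing, and the exact-match rule for plate text are all assumptions made for illustration. The `iou` helper is the same overlap measure that detection metrics such as mAP are typically built on, but mAP itself is not computed here.

```python
from typing import List, Tuple

# Axis-aligned bounding box: (x_min, y_min, x_max, y_max). Assumed format.
Box = Tuple[float, float, float, float]
# One plate per image: its box and its character string. Assumed format.
Plate = Tuple[Box, str]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def end_to_end_accuracy(
    predictions: List[Plate],
    ground_truth: List[Plate],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> float:
    """Fraction of ground-truth plates that are both localized and read correctly.

    Assumes predictions[i] and ground_truth[i] refer to the same image.
    A plate counts as correct only if the predicted box overlaps the
    annotated box with IoU >= iou_threshold AND the predicted text
    matches the annotated text exactly.
    """
    if not ground_truth:
        return 0.0
    correct = 0
    for (pred_box, pred_text), (gt_box, gt_text) in zip(predictions, ground_truth):
        if iou(pred_box, gt_box) >= iou_threshold and pred_text == gt_text:
            correct += 1
    return correct / len(ground_truth)


if __name__ == "__main__":
    # Toy data: the first plate is read correctly, the second has one wrong character.
    gt = [((0, 0, 10, 10), "ABC123"), ((0, 0, 10, 10), "XYZ789")]
    pred = [((1, 1, 10, 10), "ABC123"), ((0, 0, 10, 10), "XYZ780")]
    print(end_to_end_accuracy(pred, gt))
```

Requiring both a sufficient overlap and an exact text match means a single misread character fails the whole plate, which is why end-to-end accuracy is usually a stricter measure than detection quality alone.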
In summary, the paper addresses the lack of a large-scale, diverse, and challenging global license plate dataset, with the goal of enabling more accurate and robust license plate recognition models that perform better in real-world applications.