Abstract:Traditional license plate detection and recognition models are often trained on closed datasets, limiting their ability to handle the diverse license plate formats across different regions. The emergence of large-scale pre-trained models has shown exceptional generalization capabilities, enabling few-shot and zero-shot learning. We propose OneShotLP, a training-free framework for video-based license plate detection and recognition, leveraging these advanced models. Starting with the license plate position in the first video frame, our method tracks this position across subsequent frames using a point tracking module, creating a trajectory of prompts. These prompts are input into a segmentation module that uses a promptable large segmentation model to generate local masks of the license plate regions. The segmented areas are then processed by multimodal large language models (MLLMs) for accurate license plate recognition. OneShotLP offers significant advantages, including the ability to function effectively without extensive training data and adaptability to various license plate styles. Experimental results on UFPR-ALPR and SSIG-SegPlate datasets demonstrate the superior accuracy of our approach compared to traditional methods. This highlights the potential of leveraging pre-trained models for diverse real-world applications in intelligent transportation systems. The code is available at <a class="link-external link-https" href="https://github.com/Dinghaoxuan/OneShotLP" rel="external noopener nofollow">this https URL</a>.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the limitations of traditional license plate detection and recognition models when dealing with diverse license plate formats in different regions. Specifically, traditional models are usually trained on closed datasets, which restricts their ability to handle cross - regional diverse license plate formats. In addition, different countries and regions need to develop specialized license plate recognition systems to meet traffic management requirements, which significantly increases the costs of data collection, annotation, and model training. To solve these problems, the paper proposes a training - free framework, OneShotLP, for license plate tracking and recognition in videos. This framework utilizes the strong generalization ability of pre - trained large - scale models (such as foundation models) to achieve tasks that can be completed with only one prompt. The following are the main contributions of this method: 1. **Propose a training - free license plate tracking and recognition framework**: OneShotLP only needs to mark the approximate location of the license plate in the first frame of the video, and then through the point - tracking module, the segmentation module, and the multi - modal large - language models (MLLMs), it realizes the continuous tracking and recognition of the license plate in the entire video sequence. 2. **Design a tracking and recognition pipeline for video license plate analysis**: This pipeline includes three modules - a tracking module, a segmentation module, and a recognition module. The tracking module is responsible for tracking the marked points starting from the first frame; the segmentation module generates a mask of the license plate area according to the tracking points; the recognition module uses the multi - modal large - language model to perform character recognition on the segmented license plate image. 3. **Combine foundation models for tracking, detection, and recognition**: These models can generalize from limited inputs, thus effectively handling complex and diverse traffic scenes without a large amount of retraining. Experimental results show that OneShotLP performs excellently in video license plate tracking and recognition tasks, has zero - sample learning ability, and ensures accuracy and robustness. In summary, this paper aims to build a general - purpose license plate analysis system by introducing pre - trained foundation models, in order to reduce the resource expenditures required for different types of license plate analysis and improve application efficiency.

A Training-Free Framework for Video License Plate Tracking and Recognition with Only One-Shot

A Layered Algorithm for Vehicle License Plate Locating

Deep Representation Learning for License Plate Recognition in Low Quality Video Images

A Robust Attentional Framework for License Plate Recognition in the Wild

Recognizing License Plates in Real-Time

SamLP: A Customized Segment Anything Model for License Plate Detection

A multi-filter based license plate localization and recognition framework

An End to End Recognition for License Plates Using Convolutional Neural Networks

Towards end-to-end car license plate location and recognition in unconstrained scenarios

A Robust and Efficient Approach to License Plate Detection

Towards Human-Level License Plate Recognition

ALP-Net: a segmentation-free approach for license plate recognition in unconstrained scenarios

Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline

License Plate Recognition in Wild with Super-Resolution

License Plate Recognition Based On Multi-Angle View Model

License Plate Localization in Unconstrained Scenes Using a Two-Stage CNN-RNN

A CNN-Based Approach for Automatic License Plate Recognition in the Wild

Enhancement of license plate recognition performance using Xception with Mish activation function

A lightweight license plate detection algorithm based on deep learning

Vehicle and License Plate Recognition with Novel Dataset for Toll Collection

Unified Chinese License Plate Detection and Recognition with High Efficiency