Abstract: Automating the checkout process is important in smart retail, where users simply pass products by hand past a camera, triggering automatic product detection, tracking, and counting. Because annotated training data are scarce in this emerging area, we introduce a dataset of product 3D models, which allows fast, flexible, and large-scale training data generation through graphics engine rendering. Within this context, we observe that, because of this "hands-on" approach, bias in user behavior leads to distinct patterns in the real checkout process. Such patterns compromise training effectiveness if the training data fail to reflect them. To address this user bias problem, we propose a training data optimization framework, training with digital twins (DtTrain). Specifically, we leverage the product 3D models and optimize their rendering viewpoint and illumination to generate "digital twins" that visually resemble representative user images. These digital twins inherit product labels and, when augmented, form the Digital Twin training set (DT set). Because the digital twins individually mimic user bias, the resulting DT set better reflects the characteristics of the target scenario and allows us to train more effective product detection and tracking models. In our experiments, we show that the DT set outperforms training sets created by existing dataset synthesis methods in terms of counting accuracy. Moreover, combining the DT set with pseudo-labeled real checkout data yields further improvement. The code is available at https://github.com/yorkeyao/Automated-Retail-Checkout.
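The idea of optimizing rendering viewpoint and illumination so that a rendered product resembles a representative user image can be illustrated with a toy sketch. The abstract does not specify the optimizer, the renderer, or the similarity measure, so everything below is an assumption made for illustration: `render_features` is a hypothetical stand-in for "render the 3D model, then extract image features", the Euclidean `distance` is a placeholder similarity, and exhaustive grid search is swapped in for whatever procedure DtTrain actually uses.

```python
import math
from itertools import product


def render_features(azimuth_deg, light):
    """Hypothetical stand-in for rendering a product 3D model at a given
    viewpoint (azimuth) and illumination (light), then extracting features."""
    a = math.radians(azimuth_deg)
    return (light * math.cos(a), light * math.sin(a), light)


def distance(f, g):
    """Euclidean distance between two feature vectors (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(f, g)))


def fit_digital_twin(target, azimuths, lights):
    """Grid search (an assumption, not the paper's method): return the
    (azimuth, light) pair whose rendered features are closest to the target."""
    return min(
        product(azimuths, lights),
        key=lambda params: distance(render_features(*params), target),
    )


# Pretend these features came from a representative user image.
target = render_features(120, 0.8)

best = fit_digital_twin(target, range(0, 360, 30), [0.2, 0.4, 0.6, 0.8, 1.0])
print(best)
```

In a real pipeline the rendering attributes would be continuous and higher-dimensional, and the rendered "twin" would inherit the product label of the 3D model it was rendered from, which is what makes the resulting images usable as labeled training data.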

Take Goods from Shelves: A Dataset for Class-Incremental Object Detection.

Object Detection for Vision-Aided Inventory Counting

Toward New Retail: A Benchmark Dataset for Smart Unmanned Vending Machines

Unitail: Detecting, Reading, and Matching in Retail Scene

Object detection and recognition system based on computer vision analysis

A Hierarchical Grocery Store Image Dataset with Visual and Semantic Labels

A Design of Smart Unmanned Vending Machine for New Retail Based on Binocular Camera and Machine Vision

SCD: A Stacked Carton Dataset for Detection and Segmentation

Fine-Grained Grocery Product Recognition by One-Shot Learning.

Matryoshka Peek: Toward Learning Fine-Grained, Robust, Discriminative Features for Product Search

Efficient Defect Detection of Rotating Goods under the Background of Intelligent Retail

Training with Product Digital Twins for AutoRetail Checkout

Enhanced Self-Checkout System for Retail Based on Improved YOLOv10

Products-10K: A Large-scale Product Recognition Dataset

Multimodal fine-grained grocery product recognition using image and OCR text

RP2K: A Large-Scale Retail Product Dataset for Fine-Grained Image Classification

Deconvolution Single Shot Multibox Detector for Supermarket Commodity Detection and Classification

Design of Smart Unstaffed Retail Shop Based on IoT and Artificial Intelligence

Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding

A Benchmark Grocery Dataset of Realworld Point Clouds From Single View