SETRN: A Transform Structure with Adaptive 3D Attention Mechanism and Global Semantic Supervision for Mobile-Captured Retail Receipts Recognition

Lujiao Shao, Haijun Zhang, Chunxin Zhang, Han Yan, Yanxia Sun
DOI: https://doi.org/10.1109/ispce-asia60405.2023.10365849
2023-01-01
Abstract: Retail receipts serve as important vouchers issued by shopping malls. They record key transaction information accurately, enabling effective analysis of user behavior patterns and merchant sales. Mining and analyzing receipt data offers an effective way to optimize shopping mall operation strategies and enhance consumers’ shopping experience. Although current mainstream text recognition methods perform well on scanned documents, they face significant challenges when applied to receipt images captured by mobile devices such as cell phones. To achieve greater accuracy, this research proposes a novel text recognition method supervised by global semantic features. It incorporates a 3D attention module to extract high-dimensional visual features and a pure text image supervised encoder to extract global semantic features. Experimental results on both public benchmarks and a large real-world receipt text recognition dataset show that the method is robust and achieves high accuracy.