Arbitrary-shaped Scene Text Detection with Keypoint-Based Shape Representation

Qin, Shuxin; Chen, Lin
DOI: https://doi.org/10.1007/s10032-022-00396-6
2022-01-01
International Journal on Document Analysis and Recognition (IJDAR)
Abstract: Scene text detection has recently become an active research topic. Arbitrary-shaped text detection is especially challenging because of the irregular geometry of the text, such as long curved shapes. Most existing works address the problem with either bottom-up methods followed by heuristic post-processing, or top-down methods with boundary regression. Through analysis and comparison, we present an efficient framework that detects arbitrary-shaped text by fusing bottom-up and top-down methods. Specifically, we use a segmentation method as the bottom-up detector to regress the text areas, and we employ an anchor-free method as the top-down detector to represent and distinguish each text instance based on the results of the bottom-up detector. To detect text with arbitrary shapes, we propose a keypoint-based shape representation method, which treats a text instance as several keypoints linked together. The keypoints are then regressed by the top-down detector. With the keypoint-based shape representation, the detected text can be easily rectified by Thin Plate Spline (TPS) transformation, and the framework can be directly extended to support end-to-end text spotting. Extensive experiments on several public benchmarks, covering both regular-shaped and arbitrary-shaped scene text in natural images, demonstrate that our method achieves state-of-the-art performance.
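The idea of treating a text instance as linked keypoints can be illustrated with a minimal sketch. The abstract does not specify the keypoint layout, so the code assumes one possible layout: matched pairs of keypoints sampled left to right along the upper and lower text boundary. The function names `keypoints_to_polygon` and `polygon_area` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def keypoints_to_polygon(top: List[Point], bottom: List[Point]) -> List[Point]:
    """Link paired boundary keypoints into a closed polygon outline.

    Assumption (not from the paper): `top` and `bottom` are equal-length
    lists of keypoints sampled left to right along the two text boundaries.
    """
    if len(top) != len(bottom) or len(top) < 2:
        raise ValueError("need at least two matched keypoint pairs")
    # Walk the top edge left to right, then the bottom edge right to left,
    # so that consecutive vertices are always neighbours on the outline.
    return list(top) + list(reversed(bottom))


def polygon_area(poly: List[Point]) -> float:
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A gently curved text band: 2 units wide, 1 unit tall at every keypoint pair.
top = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)]
bottom = [(0.0, 1.0), (1.0, 1.5), (2.0, 1.0)]
outline = keypoints_to_polygon(top, bottom)
area = polygon_area(outline)
```

Under this assumed layout, the same matched pairs could also serve as source control points for a TPS warp onto a straight rectangle, which is one plausible reading of why the representation makes rectification easy.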