Abstract: The complex structure of Chinese characters, particularly the connections and intersections between strokes, leads to low accuracy in stroke extraction and recognition and to unclear segmentation boundaries. This study builds on the YOLOv8n-seg model to propose YOLOv8n-seg-CAA-BiFPN, a model for fine segmentation of Chinese character strokes. The proposed Coordinate-Aware Attention (CAA) mechanism divides the input feature map of the backbone network into four parts and applies different weights to horizontal, vertical, and channel attention to compute and fuse key information, thereby capturing the contextual regularity of closely arranged stroke positions. The neck of the network integrates an enhanced weighted bi-directional feature pyramid network (BiFPN), which improves the fusion of features for strokes of different sizes. The Shape-IoU loss function replaces the traditional CIoU loss function; it accounts for the shape and scale of stroke bounding boxes to optimize bounding box regression. Finally, the Grad-CAM++ technique is used to generate heatmaps of the segmentation predictions, which visualize the effective features and clarify the regions the model focuses on. Trained and tested on the public Chinese character stroke datasets CCSE-Kai and CCSE-HW, the model achieves an average accuracy of 84.71%, an average recall rate of 83.65%, and a mean average precision of 80.11%. Compared with the original YOLOv8n-seg and the mainstream segmentation models SegFormer, BiSeNetV2, and Mask R-CNN, the average accuracy improves by 3.50%, 4.35%, 10.56%, and 22.05%, respectively; the average recall rate improves by 4.42%, 9.32%, 15.64%, and 24.92%, respectively; and the mean average precision improves by 3.11%, 4.15%, 8.02%, and 19.33%, respectively. The results demonstrate that the YOLOv8n-seg-CAA-BiFPN network segments Chinese character strokes accurately.
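The weighted feature fusion that BiFPN performs in the network's neck can be illustrated with a small sketch. The abstract does not specify how the "enhanced" BiFPN differs from the standard design, so the code below only shows the fast normalized fusion commonly associated with BiFPN: each input feature map gets a learnable scalar weight, the weights are clipped to be non-negative, and they are normalized to sum to roughly one. The function name `fast_normalized_fusion`, the nested-list feature maps, and the `eps` default are assumptions made for illustration and are not taken from the paper.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped 2D feature maps using non-negative, normalized weights.

    features: list of 2D maps (lists of rows), all with identical shape.
    weights:  one scalar per map (learnable in a real network; fixed here).
    Hypothetical helper for illustration, not the paper's implementation.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")

    # ReLU keeps every weight non-negative so no input is subtracted.
    clipped = [max(0.0, w) for w in weights]

    # Normalize by the sum of weights; eps avoids division by zero.
    total = sum(clipped) + eps
    norm = [w / total for w in clipped]

    rows, cols = len(features[0]), len(features[0][0])
    fused = [
        [sum(n * f[r][c] for n, f in zip(norm, features)) for c in range(cols)]
        for r in range(rows)
    ]
    return fused, norm


if __name__ == "__main__":
    # Two toy 1x2 maps standing in for resized features from different scales.
    coarse = [[1.0, 2.0]]
    fine = [[3.0, 4.0]]
    fused, norm = fast_normalized_fusion([coarse, fine], [1.0, 1.0], eps=0.0)
    print(fused, norm)
```

In a real network the inputs would be tensors resized to a common resolution and the weights would be trained with the rest of the model; the sketch only shows how the normalization lets the network favor one scale over another when fusing strokes of different sizes.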

Uyghur Character Models with Shared Structure Information for Segmentation-free Recognition under Low Data Resource Conditions

Multi-font Multi-Size Printed Uyghur Character Recognition

Recognition of A Limited Chinese Character Set Based on Improved CLAFIC-LSM Algorithm

Design and implementation of prototype system for online handwritten Uyghur character recognition

Scene Uyghur Recognition Based on Visual Prediction Enhancement

Single Character Font Recognition of Character Dependent

Uyghur, Chinese and English Multilingual Document Recognition

Content-independent font recognition on a single Chinese character using sparse representation

Multi-Step Segmentation Method Based on Adaptive Thresholds for Chinese Calligraphy Characters

Multi-font Printed Mongolian Document Recognition System

Fine Segmentation of Chinese Character Strokes Based on Coordinate Awareness and Enhanced BiFPN

LCSegNet: An Efficient Semantic Segmentation Network for Large-Scale Complex Chinese Character Recognition

A Novel Method Based on Character Segmentation for Slant Chinese Screen-render Text Detection and Recognition

An approach for handwritten Chinese text recognition unifying character segmentation and recognition

Graph Model Optimization Based Historical Chinese Character Segmentation Method

Research on a Chinese Character Recognition Framework based on Multi-dimensional Image Information

Post Processing for Offline Chinese Handwritten Character String Recognition

Local Projection-Based Character Segmentation Method for Historical Chinese Documents

Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning

Chinese Character Recognition with Augmented Character Profile Matching

Recognition of Handwritten Chinese Text by Segmentation: A Segment-annotation-free Approach