Abstract: The complex structure of Chinese characters, particularly the connections and intersections between strokes, leads to low accuracy in stroke extraction and recognition and to unclear segmentation boundaries. Building on the YOLOv8n-seg model, this study proposes YOLOv8n-seg-CAA-BiFPN, a model for fine segmentation of Chinese character strokes. The proposed Coordinate-Aware Attention (CAA) mechanism divides the input feature map of the backbone network into four parts and applies different weights for horizontal, vertical, and channel attention to compute and fuse key information, capturing the contextual regularity of closely arranged stroke positions. The neck of the network integrates an enhanced weighted bi-directional feature pyramid network (BiFPN), which improves feature fusion for strokes of various sizes. The Shape-IoU loss function replaces the traditional CIoU loss function; it focuses on the shape and scale of stroke bounding boxes to optimize bounding box regression. Finally, the Grad-CAM++ technique is used to generate heatmaps of the segmentation predictions, which visualize the effective features and clarify the model's focus areas. Trained and tested on the public Chinese character stroke datasets CCSE-Kai and CCSE-HW, the model achieves an average accuracy of 84.71%, an average recall of 83.65%, and a mean average precision of 80.11%. Compared with the original YOLOv8n-seg and the mainstream segmentation models SegFormer, BiSeNetV2, and Mask R-CNN, the average accuracy improves by 3.50%, 4.35%, 10.56%, and 22.05%, respectively; the average recall improves by 4.42%, 9.32%, 15.64%, and 24.92%, respectively; and the mean average precision improves by 3.11%, 4.15%, 8.02%, and 19.33%, respectively. The results demonstrate that the YOLOv8n-seg-CAA-BiFPN network segments Chinese character strokes accurately.
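
The weighted feature fusion that a BiFPN neck relies on can be illustrated with a short sketch. The abstract does not specify how the enhanced BiFPN is built, so the code below assumes the fast normalized fusion of the standard BiFPN design: each input feature map gets a learnable weight, the weights are clipped to be non-negative, and they are normalized by their sum plus a small constant. The function name `fast_normalized_fusion`, the `eps` value, and the nested-list feature maps are illustrative assumptions, not the paper's implementation.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped 2D feature maps with normalized non-negative weights.

    Assumption: this follows the standard BiFPN fast normalized fusion,
    not necessarily the enhanced variant used in the paper.
    """
    if len(features) != len(weights):
        raise ValueError("one weight is required per feature map")

    # ReLU keeps every weight non-negative so the normalization is stable.
    clipped = [max(0.0, w) for w in weights]

    # Normalize by the sum of weights; eps avoids division by zero.
    total = sum(clipped) + eps
    normalized = [w / total for w in clipped]

    rows = len(features[0])
    cols = len(features[0][0])

    # Weighted sum of the inputs at every spatial position.
    return [
        [
            sum(n * f[r][c] for n, f in zip(normalized, features))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical inputs: two 1x2 feature maps already resized to one scale.
shallow = [[1.0, 2.0]]
deep = [[3.0, 4.0]]
fused = fast_normalized_fusion([shallow, deep], weights=[1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the inputs. A negative weight is clipped to zero, so that input no longer contributes to the fused map.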

Fine Segmentation of Chinese Character Strokes Based on Coordinate Awareness and Enhanced BiFPN
