Ancient Chinese Character Recognition with Improved Swin-Transformer and Flexible Data Enhancement Strategies

Yi Zheng, Yi Chen, Xianbo Wang, Donglian Qi, Yunfeng Yan
DOI: https://doi.org/10.3390/s24072182
IF: 3.9
2024-03-29
Sensors
Abstract: The decipherment of ancient Chinese scripts, such as oracle bone and bronze inscriptions, holds immense significance for understanding ancient Chinese history, culture, and civilization. Despite substantial progress in recognizing oracle bone script, research on the overall recognition of ancient Chinese characters remains somewhat lacking. To tackle this issue, we pioneered the construction of a large-scale image dataset comprising 9233 distinct ancient Chinese characters sourced from images obtained through archaeological excavations. We propose the first model for recognizing the common ancient Chinese characters. This model consists of four stages with Linear Embedding and Swin-Transformer blocks, each supplemented by a CoT Block to enhance local feature extraction. We also advocate for an enhancement strategy, which involves two steps: firstly, conducting adaptive data enhancement on the original data, and secondly, randomly resampling the data. The experimental results, with a top-one accuracy of 87.25% and a top-five accuracy of 95.81%, demonstrate that our proposed method achieves remarkable performance. Furthermore, through the visualizing of model attention, it can be observed that the proposed model, trained on a large number of images, is able to capture the morphological characteristics of ancient Chinese characters to a certain extent.
engineering, electrical & electronic; chemistry, analytical; instruments & instrumentation
What problem does this paper attempt to address?
This paper addresses the challenges of ancient Chinese character recognition. Specifically, it focuses on improving recognition through an improved Swin-Transformer model and flexible data augmentation strategies. Ancient Chinese characters, such as oracle bone inscriptions and bronze inscriptions, differ significantly in form from modern Chinese characters, and because they span many historical periods, their writing styles also vary from one period to another. This makes it difficult for the general public and ancient character enthusiasts to understand and study the historical evolution of each character. In addition, for types of ancient Chinese characters other than oracle bone inscriptions, such as bronze inscriptions, coin inscriptions, stone inscriptions, and characters on bamboo slips, wooden tablets, and silk books, there is currently no comprehensive large-scale image dataset. To address these problems, the paper makes the following main contributions:
1. **Constructing a large-scale ancient Chinese character image dataset**: Various forms of ancient characters are collected to reflect the diversity of ancient Chinese characters, yielding the first large-scale ancient Chinese character dataset, with 9,233 classes and more than 970,000 instances.
2. **Proposing an improved Swin-Transformer model**: The model is used for feature extraction, and a data augmentation strategy is adopted to address the long-tail distribution of ancient Chinese characters.
3. **Applying a deep-learning model to this task for the first time**: Ancient Chinese character recognition is carried out on a large-scale dataset, with the aim of developing a deep-learning network that analyzes the internal commonalities of ancient Chinese characters.
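The two-step enhancement strategy described in the abstract (adaptive augmentation of the original data, followed by random resampling) can be illustrated with a rough sketch. This is not the authors' implementation: the function names `augmentation_plan` and `balanced_resample`, the `min_per_class` threshold, and the inverse-frequency weighting are all assumptions chosen to show one common way of countering a long-tail class distribution.

```python
import random
from collections import Counter

def augmentation_plan(labels, min_per_class):
    """Step 1 (assumed form): number of augmented copies each class would need
    to reach a minimum instance count. Frequent classes need none."""
    counts = Counter(labels)
    return {cls: max(0, min_per_class - n) for cls, n in counts.items()}

def balanced_resample(labels, num_samples, seed=0):
    """Step 2 (assumed form): draw sample indices with replacement so that
    every class is selected with roughly equal probability."""
    counts = Counter(labels)
    # Weight each instance by the inverse of its class frequency,
    # so the total weight of every class is the same.
    weights = [1.0 / counts[label] for label in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)

# Toy long-tailed label list: one frequent class and two rare ones.
labels = ["A"] * 90 + ["B"] * 8 + ["C"] * 2

plan = augmentation_plan(labels, min_per_class=10)
indices = balanced_resample(labels, num_samples=3000)
resampled = Counter(labels[i] for i in indices)

print(plan)       # extra samples needed per class
print(resampled)  # class counts after resampling, roughly balanced
```

In this toy setting the rare classes are drawn about as often as the frequent one, at the cost of repeating the same few rare instances, which is why an augmentation step before resampling is useful.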
Through these methods, the paper aims to improve the recognition accuracy of ancient Chinese characters and to support research on them. In the reported experiments, the model reaches a Top-1 accuracy of 87.25% and a Top-5 accuracy of 95.81%, which the authors describe as remarkable performance.
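For readers unfamiliar with the Top-1 and Top-5 metrics, the sketch below shows how top-k accuracy is typically computed: a prediction counts as correct if the true class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are illustrative assumptions and are unrelated to the paper's data or code.

```python
def top_k_accuracy(scores, targets, k):
    """Fraction of samples whose true class index is among the k
    highest-scoring classes for that sample."""
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)

# Toy example: 4 samples, 3 classes (made-up scores).
scores = [
    [0.1, 0.7, 0.2],  # true class 1: ranked first
    [0.5, 0.3, 0.2],  # true class 1: ranked second
    [0.2, 0.3, 0.5],  # true class 0: ranked third
    [0.6, 0.1, 0.3],  # true class 0: ranked first
]
targets = [1, 1, 0, 0]

top1 = top_k_accuracy(scores, targets, k=1)
top2 = top_k_accuracy(scores, targets, k=2)
print(top1, top2)
```

Top-k accuracy can only stay the same or increase as k grows, which is consistent with the reported Top-5 figure being higher than the Top-1 figure.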