A study on encoding-based oracle bone script recognition

Tingzhu Chen, Yaoyao Qian, Jingyu Pei, Shaoteng Wu, Jiang Wu, Lin Li, Jung-yueh Tu
DOI: https://doi.org/10.1177/2513850220952890
2020-12-01
Journal of Chinese Writing Systems
Abstract: Oracle bone script recognition (OBSR) has been a fundamental problem in research on oracle bone scripts for decades. Despite being intensively studied, existing OBSR methods are still subject to limitations regarding recognition accuracy, speed and robustness. Furthermore, the dependency of these methods on expert knowledge hinders the adoption of OBSR systems by the general public and also discourages social outreach of research outputs. Addressing these issues, this study proposes an encoding-based OBSR system that applies image pre-processing techniques to encode oracle images into small matrices and recognize oracle characters in the encoding space. We tested our methods on a collection of oracle bones from the Yin Ruins in XiaoTun village, and achieved a high accuracy rate of 99% within a time range of milliseconds.
What problem does this paper attempt to address?
This paper addresses several key challenges in oracle bone script recognition (OBSR): limited recognition accuracy, speed, and robustness, and a dependence on expert knowledge that hinders public adoption of OBSR systems and limits their social reach. Specifically:

1. **Recognition accuracy**: Existing OBSR methods have low classification rates and limited accuracy because oracle bone characters have complex shapes, many variants, and a large share of low-frequency characters.
2. **Recognition speed**: Traditional methods need large amounts of memory and computing time when processing large datasets, so they run inefficiently.
3. **Robustness**: Existing methods recognize poorly and resist interference weakly when oracle bone inscription images are rotated, scaled, translated (RST), or occluded.
4. **Dependence on expert knowledge**: Current OBSR systems rely heavily on expert knowledge, which limits their use by the general public and hinders the dissemination of research results.

To solve these problems, the paper proposes an encoding-based oracle bone script recognition system. The system uses image pre-processing techniques to encode oracle bone inscription images into small matrices, then recognizes the characters in the encoding space. According to the experimental results, the method substantially improves recognition accuracy, recognition time, and resistance to interference, while depending little on expert knowledge.

Specific technical details include:

- **Image pre-processing**: Segment the image with templates of different sizes (such as 3×3 and 16×16) and compute the gray value of each segment to extract feature information.
- **Encoding and classification**: Compute a unique code for each character so that recognition is efficient and accurate.
- **Experimental verification**: Experiments were carried out on oracle bone inscription samples from "Xiaotun Village, Central South". The results show that the method can process 6,230 single-character images within 2 minutes, with a reported recognition rate of 100%.

In conclusion, the paper aims to improve the practical performance of oracle bone script recognition and reduce its dependence on expert knowledge, thereby promoting the popularization and social impact of oracle bone script research.
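The pre-processing and encoding steps listed above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `encode` and `recognize`, the 16×16 grid, the use of the mean gray value per cell, and Euclidean nearest-neighbour matching are all assumptions chosen to show the general idea of reducing an image to a small matrix and comparing characters in that encoding space.

```python
import numpy as np


def encode(image: np.ndarray, grid: int = 16) -> np.ndarray:
    """Reduce a 2-D grayscale image to a grid x grid matrix of mean gray values.

    Hypothetical helper: the paper's actual encoding may differ.
    """
    h, w = image.shape
    if h < grid or w < grid:
        raise ValueError("image must be at least grid x grid pixels")
    # Cell boundaries; linspace keeps cells near-equal when sizes do not divide evenly.
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    code = np.empty((grid, grid), dtype=np.float64)
    for i in range(grid):
        for j in range(grid):
            cell = image[rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
            code[i, j] = cell.mean()
    return code


def recognize(image: np.ndarray, references: dict, grid: int = 16) -> str:
    """Return the label whose stored code is closest to the query's code.

    Assumes Euclidean distance in the encoding space (an illustrative choice).
    """
    query = encode(image, grid)
    return min(references, key=lambda label: np.linalg.norm(references[label] - query))


# Synthetic stand-ins for character images: a horizontal and a vertical stroke.
horizontal = np.zeros((64, 64))
horizontal[28:36, :] = 255
vertical = np.zeros((64, 64))
vertical[:, 28:36] = 255

references = {"horizontal": encode(horizontal), "vertical": encode(vertical)}

# A noisy copy of the vertical stroke should still match "vertical".
rng = np.random.default_rng(0)
noisy = np.clip(vertical + rng.normal(0, 20, vertical.shape), 0, 255)
result = recognize(noisy, references)
print(result)
```

Matching on 16×16 codes instead of full images means each comparison touches only 256 numbers, which is one plausible reason an encoding-space approach can answer in milliseconds. The sketch does not handle rotation, scaling, or occlusion, which the paper treats as part of robustness.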