Speech2Stroke: Generate Chinese Character Strokes Directly from Speech

Yinhui Zhang, Wei Xi, Zhao Yang, Sitao Men, Rui Jiang, Yuxin Yang, Jizhong Zhao
DOI: https://doi.org/10.1007/978-3-030-67537-0_6
2021-01-01
Abstract: A Chinese character is composed of a spatial arrangement of strokes. Some of these strokes combine to form the phonetic component, which provides a clue to the pronunciation of the entire character; the others combine to form the semantic component, which conveys semantic-level information for the speech context. How close is the connection between the internal strokes of Chinese characters and speech? In this paper, we propose Speech2Stroke, an end-to-end model that exploits the phonetic- and morphological-level information of pictographic words. Specifically, Speech2Stroke generates strokes directly from speech. Its performance is evaluated by the stroke error rate (SER). The best model achieves an SER of 20.61%. Through experiments and analysis, we show that our model is able to capture the alignment between audio and the internal structures of pictographic characters.
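The abstract does not define how SER is computed. As a rough illustration only, the sketch below assumes SER is analogous to word or character error rate: the Levenshtein edit distance between the predicted and reference stroke sequences, divided by the total reference length. The function names and the stroke labels (`heng`, `shu`, `pie`, `na`) are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def stroke_error_rate(references, hypotheses):
    """Assumed definition: total edit distance / total reference strokes."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_len = sum(len(ref) for ref in references)
    if total_len == 0:
        raise ValueError("reference stroke sequences are empty")
    total_err = sum(edit_distance(ref, hyp)
                    for ref, hyp in zip(references, hypotheses))
    return total_err / total_len


# Illustrative labels only: a four-stroke reference with one stroke missing
# from the prediction gives one error over four reference strokes.
reference = [["heng", "shu", "pie", "na"]]
predicted = [["heng", "shu", "na"]]
print(stroke_error_rate(reference, predicted))
```

Under this assumed definition, an SER of 20.61% would correspond to roughly one stroke-level edit for every five reference strokes.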