Implementation of Tibetan-Chinese Translation Platform Based on LSTM Algorithm

XiaoFeng Chen, Hao Wang, Wei Xiang
DOI: https://doi.org/10.1145/3321408.3326670
2019-01-01
Abstract: With rapid economic development and increasingly frequent language exchange in Tibet, traditional statistical machine translation methods face problems such as scarce data and overfitting during training, which result in poor translation quality. Building on current developments in natural language processing (NLP), the paper proposes a Tibetan-Chinese neural machine translation method that uses the LSTM algorithm, implemented on Google's TensorFlow framework. In the corpus preprocessing stage, a word segmentation module for the Tibetan-Chinese bilingual parallel corpus is built by combining a gated recurrent unit (GRU) network with a conditional random field (CRF). In the model construction stage, the translation model is built with LSTM. The experimental results show that long short-term memory (LSTM) networks achieve good translation quality.
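For readers unfamiliar with the LSTM named in the abstract, the following is a minimal, framework-free sketch of one standard LSTM cell step and of running it over a short input sequence, as an encoder would. It is not the authors' TensorFlow implementation. The function names (`lstm_step`, `init_params`, `encode`), the weight layout, and the toy dimensions are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a standard LSTM cell (hypothetical helper).

    params maps each gate name ("i", "f", "o", "g") to a (W, b) pair,
    where W has shape hidden_size x (input_size + hidden_size).
    """
    z = x + h_prev  # concatenate the input with the previous hidden state
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [p + bias for p, bias in zip(matvec(W, z), b)]
        # The candidate "g" uses tanh; input/forget/output gates use sigmoid.
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # New cell state: keep part of the old state, add part of the candidate.
    c = [f * cp + i * g
         for f, cp, i, g in zip(gates["f"], c_prev, gates["i"], gates["g"])]
    # New hidden state: output gate applied to the squashed cell state.
    h = [o * math.tanh(ct) for o, ct in zip(gates["o"], c)]
    return h, c


def init_params(input_size, hidden_size, seed=0):
    # Small random weights and zero biases; the scale is an arbitrary choice.
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.1, 0.1)
                 for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]

    return {name: (mat(), [0.0] * hidden_size)
            for name in ("i", "f", "o", "g")}


def encode(sequence, params, hidden_size):
    # Run the cell over a sequence of input vectors, starting from zeros.
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c
```

In a translation model, the final `(h, c)` returned by `encode` would typically initialize a decoder LSTM that emits target-language tokens one at a time. Framework layers such as TensorFlow's Keras LSTM layer package the same recurrence with trainable weights.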