Automatic Parallel Corpus Creation for Hindi-English News Translation Task

Aditya Kumar Pathak, Priyankit Acharya, Dilpreet Kaur, Rakesh Chandra Balabantaray
DOI: https://doi.org/10.48550/arXiv.1901.08625
2019-01-24
Computation and Language
Abstract: A parallel corpus is very important for multilingual NLP tasks and for deep learning applications such as Statistical Machine Translation systems. The Hindi-English parallel corpora available to date for the news translation task are very limited in size relative to what such systems require. In this work we have developed a prototype of an automatic parallel corpus generation system, which creates a Hindi-English parallel corpus for the news translation task. To verify the quality of the generated parallel corpus, we have experimented with various performance metrics, and the results are quite interesting.