Meta AI system is a boost to endangered languages — as long as humans aren't forgotten

DOI: https://doi.org/10.1038/d41586-024-01619-y
IF: 64.8
2024-06-06
Nature
Abstract: Automated approaches to translation could provide a lifeline to under-resourced languages, but only if companies engage with the people who speak them.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the poor coverage of low-resource languages in machine translation. Specifically, it focuses on how to extend machine translation systems to low-resource languages, which existing translation technologies neglect because too few digital resources exist for them. The paper describes an approach to improving translation for these languages: having professional translators create seed data sets, developing techniques for mining parallel data sets from web data, and compiling a list of "toxic" words for each language to identify translated content that may constitute hate speech. In addition, the paper stresses the importance of sustained engagement with the communities that speak these languages, so that machine translation does not become another form of "parachute science", in which researchers from high-income countries take advantage of communities in low-income countries without ongoing cooperation and support.
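As a rough illustration of the per-language "toxic" word list idea mentioned above, the following minimal sketch flags translated text that contains a listed term. It is not the paper's implementation: the function name `flag_toxic_terms`, the language codes, and the placeholder list entries are all assumptions made for this example, and a real list would be compiled with speakers of each language.

```python
import re

# Hypothetical per-language word lists. The entries are placeholder tokens
# invented for this sketch, not real data from any published list.
TOXIC_WORDS = {
    "eng": {"badword", "slur"},
    "fra": {"grosmot"},
}

def flag_toxic_terms(text, lang, word_lists=TOXIC_WORDS):
    """Return the listed terms for `lang` that occur in `text`, sorted."""
    listed = word_lists.get(lang, set())
    # Lowercase and split into word tokens; this simple tokenization is an
    # assumption and does not suit languages written without word spacing.
    tokens = set(re.findall(r"\w+", text.lower()))
    return sorted(tokens & listed)

if __name__ == "__main__":
    print(flag_toxic_terms("This output contains a BADWORD.", "eng"))
```

A match only marks a translation for human review. Whether a flagged term actually amounts to hate speech depends on context, which is one reason the involvement of speaker communities matters.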