When is Wall a Pared and when a Muro? -- Extracting Rules Governing Lexical Selection

Aditi Chaudhary, Kayo Yin, Antonios Anastasopoulos, Graham Neubig
DOI: https://doi.org/10.48550/arXiv.2109.06014
2021-09-13
Abstract: Learning fine-grained distinctions between vocabulary items is a key challenge in learning a new language. For example, the noun "wall" has different lexical manifestations in Spanish -- "pared" refers to an indoor wall while "muro" refers to an outside wall. However, this variety of lexical distinction may not be obvious to non-native learners unless the distinction is explained in such a way. In this work, we present a method for automatically identifying fine-grained lexical distinctions, and extracting concise descriptions explaining these distinctions in a human- and machine-readable format. We confirm the quality of these extracted descriptions in a language learning setup for two languages, Spanish and Greek, where we use them to teach non-native speakers when to translate a given ambiguous word into its different possible translations. Code and data are publicly released at https://github.com/Aditi138/LexSelection
Computation and Language