PuzzLing Machines: A Challenge on Learning From Small Data

Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych
DOI: https://doi.org/10.48550/arXiv.2004.13161
2020-04-28
Abstract: Deep neural models have repeatedly proved excellent at memorizing surface patterns from large datasets for various ML and NLP benchmarks. They struggle to achieve human-like thinking, however, because they lack the skill of iterative reasoning upon knowledge. To expose this problem in a new light, we introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students. These puzzles are carefully designed to contain only the minimal amount of parallel text necessary to deduce the form of unseen expressions. Solving them does not require external information (e.g., knowledge bases, visual signals) or linguistic expertise, but meta-linguistic awareness and deductive skills. Our challenge contains around 100 puzzles covering a wide range of linguistic phenomena from 81 languages. We show that both simple statistical algorithms and state-of-the-art deep neural models perform inadequately on this challenge, as expected. We hope that this benchmark, available at https://ukplab.github.io/PuzzLing-Machines/, inspires further efforts towards a new paradigm in NLP, one that is grounded in human-like reasoning and understanding.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses a weakness of current deep learning models: they perform poorly when only a small amount of data is available, largely because they lack the human ability to reason iteratively over knowledge. To expose this weakness, the authors introduce a challenge named "PuzzLing Machines", built from Rosetta Stone puzzles taken from Linguistic Olympiads for high school students. Each puzzle is carefully designed to contain only the minimal amount of parallel text needed to deduce the form of unseen expressions. Solving these puzzles requires no external information or linguistic expertise, only meta-linguistic awareness and deductive skills. Through this challenge, the authors hope to inspire research towards NLP models that are grounded in human-like reasoning and understanding.