Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks

Tristan Thrush, Kushal Tirumala, Anmol Gupta, Max Bartolo, Pedro Rodriguez, Tariq Kane, William Gaviria Rojas, Peter Mattson, Adina Williams, Douwe Kiela
DOI: https://doi.org/10.48550/arXiv.2204.01906
2022-04-05
Abstract: We introduce Dynatask: an open-source system for setting up custom NLP tasks that aims to greatly lower the technical knowledge and effort required for hosting and evaluating state-of-the-art NLP models, as well as for conducting model-in-the-loop data collection with crowdworkers. Dynatask is integrated with Dynabench, a research platform for rethinking benchmarking in AI that facilitates human-and-model-in-the-loop data collection and evaluation. To create a task, users only need to write a short task configuration file from which the relevant web interfaces and model hosting infrastructure are automatically generated. The system is available at https://dynabench.org/ and the full library can be found at https://github.com/facebookresearch/dynabench.
Computation and Language, Artificial Intelligence