The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding

Kenneth Enevoldsen, Márton Kardos, Niklas Muennighoff, Kristoffer Laigaard Nielbo
2024-06-04
Abstract: The evaluation of English text embeddings has transitioned from evaluating a handful of datasets to broad coverage across many tasks through benchmarks such as MTEB. However, this is not the case for multilingual text embeddings due to a lack of available benchmarks. To address this problem, we introduce the Scandinavian Embedding Benchmark (SEB). SEB is a comprehensive framework that enables text embedding evaluation for Scandinavian languages across 24 tasks, 10 subtasks, and 4 task categories. Building on SEB, we evaluate more than 26 models, uncovering significant performance disparities between public and commercial solutions not previously captured by MTEB. We open-source SEB and integrate it with MTEB, thus bridging the text embedding evaluation gap for Scandinavian languages.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the evaluation of multilingual text embeddings, specifically for Scandinavian languages. Existing large-scale benchmarks such as MTEB offer limited support for non-English embeddings, particularly in task coverage, reproducibility, and domain coverage. To close this gap, the paper introduces the Scandinavian Embedding Benchmark (SEB), a comprehensive framework for evaluating text embeddings in Danish, Swedish, Norwegian, and other Scandinavian languages. SEB covers 24 tasks, 10 subtasks, and 4 task categories. Using SEB, the authors evaluate more than 26 models and find significant performance disparities between public and commercial solutions that MTEB had not previously captured. SEB is also open-sourced and integrated with MTEB, which is intended to support the development of Scandinavian and multilingual embedding models.