An aligned corpus of Spanish bibles
Gerardo Sierra, Gemma Bel-Enguix, Ameyali Díaz-Velasco, Natalia Guerrero-Cerón, Núria Bel
DOI: https://doi.org/10.1007/s10579-024-09726-y
2024-03-16
Language Resources and Evaluation
Abstract: We present an aligned parallel corpus of Spanish translations of the Bible. The collection comprises eleven Bibles from different centuries (XVI, XIX, XX), religious denominations (Protestant, Catholic), and geographical regions (Spain, Latin America). The verses have been carefully aligned across the translations so that the content is organized coherently. As a result, the corpus is a convenient resource for various linguistic analyses, including paraphrase detection, semantic clustering, and the exploration of biases present in the texts. To illustrate its utility, we provide several examples showing how it can be employed in these applications.
computer science, interdisciplinary applications