Zema Dataset: A Comprehensive Study of Yaredawi Zema with a Focus on Horologium Chants

Mequanent Argaw Muluneh, Yan-Tsung Peng, Worku Abebe Degife, Nigussie Abate Tadesse, Aknachew Mebreku Demeku, Li Su
DOI: https://doi.org/10.1109/ICT4DA62874.2024.10777238
2024-12-25
Abstract: Computational music research plays a critical role in advancing music production, distribution, and understanding across various musical styles worldwide. Despite their immense cultural and religious significance, Ethiopian Orthodox Tewahedo Church (EOTC) chants are relatively underrepresented in computational music research. This paper contributes to this field by introducing a new dataset specifically tailored for analyzing EOTC chants, also known as Yaredawi Zema. This work provides a comprehensive overview of the creation and curation process of a 10-hour dataset of 369 instances, including rigorous quality assurance measures. Our dataset has detailed word-level temporal boundary and reading tone annotations, along with the corresponding chanting mode label for each audio recording. Moreover, we have identified the chanting options associated with multiple chanting notations in the manuscript and annotated them accordingly. Our goal in making this dataset available to the public is to encourage more research and study of EOTC chants, including lyrics transcription, lyric-to-audio alignment, and music generation tasks. Such research will advance knowledge and efforts to preserve this distinctive liturgical music, a priceless cultural artifact for the Ethiopian people.
Audio and Speech Processing, Information Retrieval, Signal Processing
### What problems does this paper attempt to solve?

This paper aims to fill a gap in computational music research on Yaredawi Zema, the chant tradition of the Ethiopian Orthodox Tewahedo Church (EOTC). It does so by introducing a new dataset that supports the computational analysis and understanding of EOTC chants. The main problems it addresses are:

1. **Preservation of cultural and religious significance**: EOTC chants carry important cultural and religious value, but without support from modern technology this heritage is at risk of being lost. The paper aims to help preserve and pass on these chants by creating a dedicated dataset.
2. **Lack of datasets**: Computational music research has made remarkable progress on other musical styles, but relatively little work has been done on EOTC chants. The paper addresses this gap with a detailed, high-quality dataset intended to draw more researchers to this music.
3. **Multi-task support**: The dataset contains audio files together with lyrics text, reading tone annotations, and the corresponding chanting mode labels. It can therefore be used for several tasks, such as chanting mode classification, lyrics transcription, lyric-to-audio alignment, and music generation.
4. **Complexity and diversity**: EOTC chants use complex notation and allow multiple ways of chanting the same text. By annotating these in detail, the paper provides rich resources for future musicological analysis.
5. **Interdisciplinary applications**: By combining traditional music with modern computational tools, the dataset supports music information retrieval and generation tasks and can also serve research in fields such as linguistics and history, encouraging interdisciplinary work.
### Specific contributions of the dataset

- **10 hours of audio**: The dataset contains 369 instances of Se’atat Zema (Horologium chants), which belongs to the Qidase-bet school.
- **Word-level temporal boundary annotations**: The start and end time of each word is annotated.
- **Reading tone annotations**: The reading tone of each word is also annotated, which is important for understanding how pronunciation changes across chanting modes.
- **Chanting mode labels**: Each recording is labeled with one of the three main chanting modes: Ge'ez, Ezil, and Araray.
- **Annotations of chanting options**: Where the same passage of lyrics can be chanted in more than one way, the alternatives are annotated as well.

In short, by creating a comprehensive, high-quality dataset, the paper aims to promote in-depth research on EOTC chants and a wide range of applications, helping to preserve and pass on this cultural heritage.
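
The annotation layers listed above can be pictured as a simple record per recording. The following is a minimal sketch only: the class names, field names, and consistency checks are assumptions made for illustration and do not reflect the dataset's actual file format or schema, which this summary does not specify. In particular, the rule that word intervals must not overlap is an assumed sanity check, not a documented property of the data.

```python
from dataclasses import dataclass
from typing import List

# The three chanting modes named in the summary above.
CHANT_MODES = {"Ge'ez", "Ezil", "Araray"}


@dataclass
class WordAnnotation:
    """One annotated word. Field names are hypothetical, not the dataset's schema."""
    word: str
    start: float       # seconds from the start of the recording
    end: float         # seconds from the start of the recording
    reading_tone: str  # free-form label; the real tone inventory is not given here


@dataclass
class ChantInstance:
    """One recording with its chanting mode label and word-level annotations."""
    instance_id: str
    mode: str
    words: List[WordAnnotation]


def validate(instance: ChantInstance) -> List[str]:
    """Return a list of problems found; an empty list means the record looks consistent."""
    problems = []
    if instance.mode not in CHANT_MODES:
        problems.append(f"unknown mode: {instance.mode}")
    previous_end = 0.0
    for i, w in enumerate(instance.words):
        if w.end <= w.start:
            problems.append(f"word {i} has non-positive duration")
        # Assumed rule: words are listed in time order and do not overlap.
        if w.start < previous_end:
            problems.append(f"word {i} overlaps the previous word")
        previous_end = max(previous_end, w.end)
    return problems


if __name__ == "__main__":
    # Placeholder words and tone labels; not taken from the dataset.
    demo = ChantInstance(
        instance_id="demo-001",
        mode="Ezil",
        words=[
            WordAnnotation("w1", 0.0, 1.2, "tone-a"),
            WordAnnotation("w2", 1.2, 2.5, "tone-b"),
        ],
    )
    print(validate(demo))
```

A structure like this makes the downstream tasks mentioned above easier to set up: the `mode` field could serve as a classification target, and the `start` and `end` times as reference boundaries for lyric-to-audio alignment.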