JNV Corpus: A Corpus of Japanese Nonverbal Vocalizations with Diverse Phrases and Emotions

Detai Xin, Shinnosuke Takamichi, Hiroshi Saruwatari
2023-05-21
Abstract: We present the JNV (Japanese Nonverbal Vocalizations) corpus, a corpus of Japanese nonverbal vocalizations (NVs) with diverse phrases and emotions. Existing Japanese NV corpora lack phrase or emotion diversity, which makes it difficult to analyze NVs and to support downstream tasks such as emotion recognition. We first propose a corpus-design method with two phases: (1) collecting NV phrases through crowdsourcing; (2) recording NVs by stimulating speakers with emotional scenarios. Using this method, we then collect 420 audio clips from 4 speakers, covering 6 emotions. Results of objective and subjective experiments demonstrate that the collected NVs have high emotion recognizability and authenticity, comparable to previous corpora of English NVs. Additionally, we analyze the distribution of vowel types in Japanese NVs. To the best of our knowledge, JNV is currently the largest Japanese NV corpus in terms of phrase and emotion diversity.
Sound, Audio and Speech Processing
What problem does this paper attempt to address?
The paper aims to address the following issues:

1. **Lack of phrase and emotion diversity in existing Japanese nonverbal vocalization (NV) corpora**: Existing Japanese NV corpora lack phrase or emotion diversity, which makes it difficult to analyze NVs and to support downstream tasks such as emotion recognition.
2. **Need for a new corpus-design method**: To overcome this, the authors propose a two-phase corpus-design method: first collecting diverse NV phrases through crowdsourcing, then recording NVs by stimulating speakers with emotional scenarios.

Using this method, the authors constructed the JNV (Japanese Nonverbal Vocalizations) corpus, which contains 420 audio clips from 4 speakers and covers 6 emotions. To the authors' knowledge, it is the largest Japanese NV corpus in terms of phrase and emotion diversity. The paper also reports objective and subjective experiments showing that the collected NVs have high emotion recognizability and authenticity, comparable to previous corpora of English NVs, and analyzes the distribution of vowel types in Japanese NVs.