Pansori: ASR Corpus Generation from Open Online Video Contents

Yoona Choi, Bowon Lee
DOI: https://doi.org/10.48550/arXiv.1812.09798
2018-12-24
Abstract: This paper introduces Pansori, a program for creating ASR (automatic speech recognition) corpora from online video content. It uses a cloud-based speech API, which makes it easy to create corpora in different languages. Using this program, we semi-automatically generated the Pansori-TEDxKR dataset from Korean TED conference talks with community-transcribed subtitles. The dataset is the first high-quality corpus for the Korean language freely available for independent research. Pansori is released as open-source software, and the generated corpus is released under a permissive public license for community use and participation.
Audio and Speech Processing, Computation and Language, Sound