SpeechFind: an experimental on-line spoken document retrieval system for historical audio archives

Bowen Zhou, John H. L. Hansen
DOI: https://doi.org/10.21437/icslp.2002-542
2002-01-01
Abstract: In this study, we present the SpeechFind system, an experimental on-line spoken document retrieval system for historical audio archives. As part of an ongoing U.S. NSF Digital Library Initiative project entitled the National Gallery of the Spoken Word (NGSW), SpeechFind is intended to serve as an audio index and search engine for spoken word collections spanning the 20th century, with as many as 60,000 hours of audio archives. In this paper, we describe the system architecture of SpeechFind, with a focus on the audio data transcription and information retrieval components. Using a sample test collection of audio data from the past 60 years, we present an evaluation of the individual system components and of overall performance.