Multimedia question answering

Richang Hong, Meng Wang, Guangda Li, Liqiang Nie, Zheng-Jun Zha, Tat-Seng Chua
2012-10-01
Abstract: Yahoo, users can easily become overwhelmed. Question-answering (QA) research attempts to tackle this information-overload problem. Instead of returning a ranked list of documents, as with current search engines, QA leverages advanced media content, linguistic analysis, and domain knowledge to return precise answers to users’ natural-language questions. However, to date, QA research has largely focused on text. Given that the vast amount of information on the Web is now in multimedia form, it is natural to extend text-based QA research to multimedia QA (MMQA). (We identify all types of answers except pure text as multimedia answers, including images, video, images and text, and so forth.) Further MMQA research must bear in mind several key points [1]. First, we must manage incomplete metadata and clean up noisy annotations. Second, appropriate multimedia answers are more intuitive for some questions. Third, multimedia answers are readily available for some types of questions given the popularity of video- and image-sharing sites. Thus, MMQA can complement text QA in a complete QA paradigm in which the best answers might be a combination of text and other media. Thus far, few works have addressed MMQA services. Hui Yang and his colleagues presented an early system specifically designed to address video QA for news video [2]. Their work follows an architecture similar to text-based QA, with video content analysis performed at various stages of the QA pipeline. Following this work, several video QA systems were proposed, most of which relied on the use of textual transcripts derived from video optical character …