Multimodal Question Answering over Structured Data with Ambiguous Entities

Huadong Li, Yafang Wang, Gerard de Melo, Changhe Tu, Baoquan Chen
DOI: https://doi.org/10.1145/3041021.3054135
2017-01-01
Abstract: In recent years, we have witnessed profound changes in the way people satisfy their information needs. For instance, with the ubiquitous 24/7 availability of mobile devices, the number of search engine queries on mobile devices has reportedly overtaken that of queries on regular personal computers. In this paper, we consider the task of multimodal question answering over structured data, in which a user supplies not just a natural language query but also an image. Our system addresses this by optimizing a non-convex objective function capturing multimodal constraints. Our experiments show that this enables it to answer even very challenging ambiguous entity queries with high accuracy.