Learning To Extract Attribute Values From A Search Engine With Few Examples

Xingxing Zhang, Tao Ge, Zhifang Sui
DOI: https://doi.org/10.1007/978-3-642-41491-6_15
2013-01-01
Abstract: We propose an attribute value extraction method based on analysing snippets from a search engine. First, a pattern-based detector locates candidate attribute values in the snippets. Then a classifier predicts whether each candidate value is correct. Training this classifier requires only a few annotated <entity, attribute, value> triples, and sufficient training data can be generated automatically by matching these triples back to snippets and titles. Finally, because a correct value may appear in multiple snippets, the individual predictions are combined by voting to exploit this redundancy. Experiments on both Chinese and English corpora in the celebrity domain demonstrate the effectiveness of our method: with only 15 annotated <entity, attribute, value> triples, precision exceeds 85% for 7 of 12 attributes, and 11 of 12 attributes improve over a state-of-the-art method.
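The last two steps of the pipeline (automatic labelling from seed triples, then voting over per-snippet predictions) can be illustrated with a minimal Python sketch. The abstract does not specify the matching rule or the voting scheme, so everything here is an assumption made for illustration: the function names `label_candidates` and `vote` are hypothetical, matching is a case-insensitive exact comparison, and voting is a simple count of positive predictions per value.

```python
from collections import Counter


def label_candidates(seed_value, candidates):
    """Label candidates found in snippets against one annotated seed value.

    Assumption: a candidate is a positive training example when it equals
    the seed value after trimming whitespace and lowercasing. The paper's
    actual matching rule may differ.
    """
    target = seed_value.strip().lower()
    return [(c, c.strip().lower() == target) for c in candidates]


def vote(predictions):
    """Combine per-snippet classifier decisions into one value.

    predictions: list of (candidate_value, predicted_correct) pairs, one
    per candidate occurrence in a snippet.
    Assumption: the winner is the value with the most positive
    predictions; None is returned when nothing was predicted correct.
    """
    counts = Counter(value for value, ok in predictions if ok)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up candidates for one hypothetical <entity, attribute> pair.
    print(label_candidates("Honolulu", ["honolulu ", "Chicago"]))
    print(vote([("1961", True), ("1962", True), ("1961", True), ("1970", False)]))
```

Counting only positive predictions means a value that appears often but is usually rejected by the classifier does not win on frequency alone. A weighted vote over classifier scores would be an equally plausible reading of the abstract.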