FEED - A Chinese Financial Event Extraction Dataset Constructed by Distant Supervision.

Guozheng Li, Peng Wang, Jiafeng Xie, Ruilong Cui, Zhenkai Deng
DOI: https://doi.org/10.1145/3502223.3502229
2021-01-01
Abstract: As an essential task in information extraction, event extraction (EE) provides abundant and valuable structured information and has been shown to be a useful source of background knowledge for applications in various domains, such as finance, legislation, and health. However, extracting events from domain documents is challenging because the relevant information for multiple events is often scattered across multiple sentences. To this end, we release FEED, a large-scale Chinese financial event extraction dataset consisting of 31,748 documents covering five financial event types, derived from Chinese financial portals. FEED accounts for both event arguments scattered across multiple sentences and single documents that contain multiple events. To construct FEED, we first extract candidate events from financial announcements with Fonduer. We then build an event knowledge base using weakly supervised classification, and finally label events via distant supervision. We also verify the usability of FEED and its ability to discriminate among baseline models. Experimental results show that FEED is challenging for existing event extraction methods, which indicates that Chinese financial event extraction remains an open problem and requires further effort. All details and resources for FEED and the event knowledge base are released at https://github.com/seukgcode/FEED.
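The final labeling step described in the abstract can be illustrated with a minimal sketch of document-level distant supervision. This is not the authors' implementation: the function `label_document`, the `Mention` record, the `min_roles` threshold, the event type names, and the English sample data are all hypothetical and chosen only for illustration. The idea shown is that a knowledge-base event is attached to a document when enough of its arguments are found anywhere in that document, so arguments may come from different sentences and one document may receive several events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    role: str
    text: str
    sentence: int  # index of the sentence containing the mention
    start: int     # character offset within that sentence


def label_document(sentences, kb_events, min_roles=2):
    """Return (event_type, mentions) pairs for KB events supported by the document."""
    labeled = []
    for event in kb_events:
        mentions = []
        for role, value in event["arguments"].items():
            # Arguments may sit in any sentence, so search the whole document.
            for idx, sentence in enumerate(sentences):
                pos = sentence.find(value)
                if pos != -1:
                    mentions.append(Mention(role, value, idx, pos))
                    break  # keep only the first mention of each argument
        # Assumed filter: require several matched roles to limit noisy labels.
        if len(mentions) >= min_roles:
            labeled.append((event["type"], mentions))
    return labeled


# Hypothetical sample data (not taken from FEED).
sentences = [
    "Acme Holdings announced a share pledge on 2020-03-01.",
    "The pledgee is First Bank, and 5,000,000 shares are involved.",
]
kb_events = [
    {"type": "EquityPledge",
     "arguments": {"pledger": "Acme Holdings",
                   "pledgee": "First Bank",
                   "shares": "5,000,000"}},
    {"type": "EquityFreeze",
     "arguments": {"holder": "Other Corp", "shares": "5,000,000"}},
]

result = label_document(sentences, kb_events)
for event_type, mentions in result:
    print(event_type, [(m.role, m.sentence) for m in mentions])
```

In this sketch the first knowledge-base event is labeled even though its arguments are split across two sentences, while the second is rejected because only one of its arguments appears. Exact string matching is the simplest possible alignment; a real pipeline would likely need entity normalization and stricter constraints to control label noise.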