Commonsense Knowledge Construction with Concept and Pretrained Model

Hanjun Cai, Feng Zhao, Hai Jin
DOI: https://doi.org/10.1007/978-3-031-20309-1_4
2022-01-01
Abstract: Commonsense knowledge (CSK) is the information that people use in daily life but rarely state explicitly. It summarizes practical knowledge about how the world works. Existing machines have knowledge but lack commonsense because they do not understand and master commonsense knowledge the way humans do. Among recent approaches, crowdsourcing-based methods are costly and have low coverage, knowledge base completion methods tend to fit existing samples too closely, and methods that extract from unstructured data suffer from low quality. CG&BF is a commonsense knowledge construction method with a concept-based generator and a BERT-based filter. We use semantic search for node matching and an entropy encoder for filtering triples with high abstraction. Two algorithms, based on concept aggregation and path credibility, are proposed to obtain high-quality CSK triples. We subsequently fine-tune a BERT model to filter incorrect triples. We obtain 500,000 CSK triples based on ConceptNet, which are superior to those of other construction methods in novelty and quality. In the reading comprehension task, the three-way attention network is selected as the base model, and the knowledge we generate enables it to perform better, which shows that the output of CG&BF has higher quality and ease of use.