Abstract: Factual consistency is one of the most important requirements when editing high quality documents. It is extremely important for automatic text generation systems such as summarization, question answering, dialog modeling, and language modeling. Still, automated factual inconsistency detection is rather under-studied. Existing work has focused on (a) detecting fake news with a knowledge base in context, or (b) detecting broad contradiction (as part of the natural language inference literature). However, there has been no work on detecting and explaining types of factual inconsistencies in text without any knowledge base in context. In this paper, we leverage existing work in linguistics to formally define five types of factual inconsistencies. Based on this categorization, we contribute a novel dataset, FICLE (Factual Inconsistency CLassification with Explanation), with ~8K samples, where each sample consists of two sentences (claim and context) annotated with the type and span of the inconsistency. When the inconsistency relates to an entity type, the entity type is also labeled at two levels (coarse and fine-grained). Further, we leverage this dataset to train a pipeline of four neural models to predict the inconsistency type with explanations, given a (claim, context) sentence pair. Explanations include the inconsistent claim fact triple, the inconsistent context span, the inconsistent claim component, and the coarse and fine-grained inconsistent entity types. The proposed system first predicts inconsistent spans from the claim and context, and then uses them to predict inconsistency types and inconsistent entity types (when the inconsistency is due to entities). We experiment with multiple Transformer-based natural language classification models as well as generative models, and find that DeBERTa performs the best. Our proposed methods achieve a weighted F1 of ~87% for inconsistency type classification across the five classes.
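The two-stage flow described in the abstract (spans first, then types and entity types) can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: the names `Explanation`, `explain`, `span_model`, `type_model`, `entity_model`, and `is_entity_related` are hypothetical, the models are passed in as plain callables, and the way the four neural models divide the work among these stages is an assumption, since the abstract does not specify it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# A fact triple is represented here as three strings; this layout is an assumption.
Triple = Tuple[str, str, str]


@dataclass
class Explanation:
    """Hypothetical container for the outputs listed in the abstract."""
    inconsistency_type: str                    # one of the five types (labels not given here)
    claim_triple: Triple                       # inconsistent claim fact triple
    context_span: str                          # inconsistent context span
    claim_component: str                       # inconsistent claim component
    coarse_entity_type: Optional[str] = None   # set only for entity-related cases
    fine_entity_type: Optional[str] = None     # set only for entity-related cases


def explain(
    claim: str,
    context: str,
    span_model: Callable[[str, str], Tuple[Triple, str]],
    type_model: Callable[[str, str, Triple, str], Tuple[str, str]],
    entity_model: Callable[[str, str, Triple, str], Tuple[str, str]],
    is_entity_related: Callable[[str, str], bool],
) -> Explanation:
    # Stage 1: predict the inconsistent spans from the claim and the context.
    triple, span = span_model(claim, context)

    # Stage 2: use the predicted spans to classify the inconsistency.
    inc_type, component = type_model(claim, context, triple, span)

    # Entity types are predicted only when the inconsistency is due to entities.
    coarse: Optional[str] = None
    fine: Optional[str] = None
    if is_entity_related(inc_type, component):
        coarse, fine = entity_model(claim, context, triple, span)

    return Explanation(inc_type, triple, span, component, coarse, fine)
```

Passing the models in as callables keeps the sketch independent of any particular classifier or generative model; each stage could be backed by a fine-tuned Transformer or by a simple stub during testing.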

Beyond Word for Word: Fact Guided Training for Neural Data-to-Document Generation

Incorporating Consistency Verification into Neural Data-to-Document Generation

Enhancing Neural Data-To-Text Generation Models with External Background Knowledge

Operation-guided Neural Networks for High Fidelity Data-To-Text Generation

Refining Data for Text Generation

PathQG: Neural Question Generation from Facts

Neural data-to-text generation with dynamic content planning

Towards information-rich, logical text generation with knowledge-enhanced neural models

WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation

RetGen: A Joint Framework for Retrieval and Grounded Text Generation Modeling

ISF-GAN: Imagine, Select, and Fuse with GPT-Based Text Enrichment for Text-to-Image Synthesis

Factuality Enhanced Language Models for Open-Ended Text Generation

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

ReFACT: Updating Text-to-Image Models by Editing the Text Encoder

Key Fact as Pivot: A Two-Stage Model for Low Resource Table-to-Text Generation

Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity

Text Generation Based on Generative Adversarial Nets with Latent Variable

Neural models for Factual Inconsistency Classification with Explanations

Long Text Generation via Adversarial Training with Leaked Information