Abstract: Graph-based neural networks and unsupervised pre-trained models are both cutting-edge text representation methods, owing to their outstanding ability to capture global information and contextualized information, respectively. However, both face obstacles to further performance improvement. On the one hand, graph-based neural networks lack knowledge orientation to guide textual interpretation during global information interaction. On the other hand, unsupervised pre-trained models contain rich implicit semantic and syntactic knowledge that is not sufficiently induced or expressed. How to effectively integrate graph-based global information with unsupervised contextualized semantic and syntactic information to achieve better text representation therefore remains an important open problem. In this paper, we propose a representation method that deeply integrates Unsupervised Semantics and Syntax into heterogeneous Graphs (USS-Graph) for inductive text classification. By constructing a heterogeneous graph whose edges and nodes are generated entirely from the knowledge of unsupervised pre-trained models, USS-Graph harmonizes the two perspectives of information under a bidirectionally weighted graph structure, thereby realizing the intra-fusion of graph-based global information and unsupervised contextualized semantic and syntactic information. Based on USS-Graph, we also propose a series of optimization measures to further improve knowledge integration and representation performance. Extensive experiments on benchmark datasets show that USS-Graph consistently achieves state-of-the-art performance on inductive text classification tasks. Additionally, extended experiments analyze in depth the characteristics of USS-Graph and the effectiveness of the proposed optimization measures for further knowledge integration and information complementation.
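To make the idea of a bidirectionally weighted heterogeneous graph more concrete, the following is a minimal sketch and not the USS-Graph implementation. It assumes, purely for illustration, that each word node and one document node carry a vector that would in practice come from a pre-trained model (here the vectors are hard-coded toy values), and that the directed weight of an edge is a softmax over cosine similarities taken from the source node's point of view. The names `build_graph`, `propagate`, and the `<doc>` node label are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_graph(word_vecs, doc_vec):
    """Build a small heterogeneous graph for one document.

    nodes: name -> (node_type, vector)
    edges: (src, dst) -> weight, directed; each node's outgoing weights
           are softmax-normalized, so w(a->b) generally differs from w(b->a).
    This weighting scheme is an assumption made for illustration only.
    """
    nodes = {w: ("word", v) for w, v in word_vecs.items()}
    nodes["<doc>"] = ("doc", doc_vec)
    edges = {}
    for src, (_, src_vec) in nodes.items():
        sims = {dst: cosine(src_vec, dst_vec)
                for dst, (_, dst_vec) in nodes.items() if dst != src}
        z = sum(math.exp(s) for s in sims.values())
        for dst, s in sims.items():
            edges[(src, dst)] = math.exp(s) / z
    return nodes, edges

def propagate(nodes, edges):
    """One round of aggregation: each node takes a weighted average of its
    neighbors' vectors, using its own outgoing edge weights."""
    dim = len(next(iter(nodes.values()))[1])
    updated = {}
    for node in nodes:
        acc = [0.0] * dim
        for (src, dst), w in edges.items():
            if src == node:
                for k in range(dim):
                    acc[k] += w * nodes[dst][1][k]
        updated[node] = acc
    return updated

# Toy vectors standing in for pre-trained model outputs (hypothetical values).
word_vecs = {
    "graph": [1.0, 0.0, 0.2],
    "text": [0.1, 1.0, 0.3],
    "node": [0.9, 0.1, 0.0],
}
doc_vec = [sum(v[k] for v in word_vecs.values()) / len(word_vecs) for k in range(3)]

nodes, edges = build_graph(word_vecs, doc_vec)
updated = propagate(nodes, edges)
```

In this sketch the two directions of an edge share the same underlying similarity but are normalized separately, which is one simple way to obtain asymmetric weights; the paper's actual edge construction may differ.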

An Iterative Graph Learning Convolution Network for Key Information Extraction Based on the Document Inductive Bias

A Character-Level Document Key Information Extraction Method with Contrastive Learning

Graph Convolution for Multimodal Information Extraction from Visually Rich Documents

The image annotation algorithm using convolutional features from intermediate layer of deep learning

RIECN: Learning Relation-Based Interactive Embedding Convolutional Network for Knowledge Graph

GraphRevisedIE: Multimodal Information Extraction with Graph-Revised Network

A Joint Learning Information Extraction Method Based on an Effective Inference Structure

JointE: Jointly utilizing 1D and 2D convolution for knowledge graph embedding

GraphIE: A Graph-Based Framework for Information Extraction

One-shot Key Information Extraction from Document with Deep Partial Graph Matching

Research on a Knowledge Graph Embedding Method Based on Improved Convolutional Neural Networks for Hydraulic Engineering

A Regularization-based Transfer Learning Method for Information Extraction via Instructed Graph Decoder

CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor

Deeply Integrating Unsupervised Semantics and Syntax into Heterogeneous Graphs for Inductive Text Classification

Knowledge graph embedding model with attention-based high-low level features interaction convolutional network

Inductive hierarchical nonnegative graph embedding for “verb–object” image classification

TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents

EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images

Inductive Relation Inference of Knowledge Graph Enhanced by Ontology Information

Deep graph layer information mining convolutional network

Exploring Effective Inter-Encoder Semantic Interaction for Document-Level Relation Extraction