Abstract: A core idea of the existing Material Genome Project is to apply data and artificial intelligence to accelerate materials innovation. However, a lack of data hinders the development of novel materials. Figures and captions in the materials literature summarize essential information from the whole document and offer ample image samples for research. Extracting figures and captions from the literature is therefore critical to addressing this data shortage. Although some PDF parsing tools can extract information from documents, they generally identify a document's figures by parsing the document into a fixed structure. Because layouts are inconsistent across journals, these tools often produce incorrect recognition results. This study therefore proposes FCENet, an efficient figure and caption extraction network. Unlike other extraction tools, it first uses an instance segmentation model to detect figures and their captions and then extracts them. FCENet builds on BlendMask and introduces a horizontal and vertical attention module. It also splits the BlendMask detection head into two branches, one for figure detection and one for caption detection, which increases final detection accuracy and speed. Nearly 3000 materials documents were collected for model training and testing. The experimental results show that FCENet performs significantly better than other existing instance segmentation models. Its box and mask mAP (mean Average Precision) are 8.51% and 12.59% higher than those of BlendMask, respectively. The authors hope that FCENet can be used to acquire a large amount of materials image data and thereby supply sufficient image data for machine learning and data mining in the materials field.

ECENet: Explainable and Context-Enhanced Network for Muti-modal Fact Verification

ESCNet: Entity-enhanced and Stance Checking Network for Multi-modal Fact-Checking

Multi-source Knowledge Enhanced Graph Attention Networks for Multimodal Fact Verification

Ecarnet: enhanced clue-ambiguity reasoning network for multimodal fake news detection

Graph Attention and Interaction Network with Multi-Task Learning for Fact Verification

A Coarse-to-fine Cascaded Evidence-Distillation Neural Network for Explainable Fake News Detection

Multi-Evidence based Fact Verification via A Confidential Graph Neural Network

A Context-Enriched Neural Network Method for Recognizing Lexical Entailment

Program Enhanced Fact Verification with Verbalization and Graph Attention Network

EMIF: Evidence-aware Multi-source Information Fusion Network for Explainable Fake News Detection

End-to-End Multimodal Fact-Checking and Explanation Generation: A Challenging Dataset and Models

Question guided multimodal receptive field reasoning network for fact-based visual question answering

A Knowledge Enhanced Learning and Semantic Composition Model for Multi-Claim Fact Checking

Kernel Graph Attention Network for Fact Verification

Improving Network Interpretability via Explanation Consistency Evaluation

Natural Language-centered Inference Network for Multi-modal Fake News Detection

MV-SHIF: Multi-view symmetric hypothesis inference fusion network for emotion-cause pair extraction in documents

FCENet: An Instance Segmentation Model for Extracting Figures and Captions From Material Documents

Cross-modal Enhancement Network for Multimodal Sentiment Analysis

Interpretable Visual Understanding with Cognitive Attention Network

Trust-Aware Evidence Reasoning and Spatiotemporal Feature Aggregation for Explainable Fake News Detection