From Detection to Application: Recent Advances in Understanding Scientific Tables and Figures

Jiani Huang, Haihua Chen, Fengchang Yu, Wei Lu
DOI: https://doi.org/10.1145/3657285
IF: 16.6
2024-04-12
ACM Computing Surveys
Abstract: Tables and figures are usually used to present information in a structured and visual way in scientific documents. Understanding the tables and figures in scientific documents is significant for a series of downstream tasks, such as academic search, scientific knowledge graphs, and so on. Existing studies mainly focus on detecting figures and tables from scientific documents, interpreting their semantics, and integrating them into downstream tasks. However, a systematic and comprehensive literature review on the mining and application of tables and figures in academic papers is still missing. In this article, we introduce the research framework and the whole pipeline for understanding tables and figures, including detection, structural analysis, interpretation, and application. We deliver a thorough analysis of benchmark datasets, recent techniques, and their pros and cons. Additionally, a quantitative analysis of the effectiveness of different models on popular benchmarks is presented. We further outline several important applications that exploit the semantics of scientific tables and figures. Finally, we highlight the challenges and some potential directions for future research. We believe this is the first comprehensive survey in understanding scientific tables and figures that covers the landscape from detection to application.
computer science, theory & methods
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses the understanding and application of tables and figures in academic literature. The authors point out that existing research mainly focuses on detecting figures and tables in scientific documents, interpreting their semantics, and integrating them into downstream tasks, yet a systematic and comprehensive literature review on the mining and application of tables and figures in academic papers is still missing. The goals of the paper are therefore:

1. **Define the Research Framework**: Introduce the research framework and the whole pipeline for understanding tables and figures, including detection, structural analysis, interpretation, and application.
2. **Benchmark Datasets and Methods**: Provide a thorough analysis of benchmark datasets, recent techniques, and their pros and cons.
3. **Quantitative Analysis**: Present a quantitative analysis of the effectiveness of different models on popular benchmarks.
4. **Important Applications**: Outline several important applications that exploit the semantics of scientific tables and figures.
5. **Challenges and Future Directions**: Highlight the challenges in current research and propose some potential directions for future research.

Through these objectives, the paper aims to fill the gap left by the absence of a systematic review of understanding and applying tables and figures in academic literature, and to provide guidance for future research.