Abstract: Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making. Automatic chart understanding has witnessed significant advancements with the rise of large foundation models in recent years. Foundation models, such as large language models, have revolutionized various natural language processing tasks and are increasingly being applied to chart understanding tasks. This survey paper provides a comprehensive overview of the recent developments, challenges, and future directions in chart understanding within the context of these foundation models. We review fundamental building blocks crucial for studying chart understanding tasks. Additionally, we explore various tasks, their evaluation metrics, and the sources of both charts and textual inputs. Various modeling strategies are then examined, encompassing both classification-based and generation-based approaches, along with tool augmentation techniques that enhance chart understanding performance. Furthermore, we discuss the state-of-the-art performance on each task and how it can be improved. Challenges and future directions are addressed, highlighting the importance of several topics, such as domain-specific charts, lack of efforts in developing evaluation metrics, and agent-oriented settings. This survey paper serves as a comprehensive resource for researchers and practitioners in the fields of natural language processing, computer vision, and data analysis, providing valuable insights and directions for future research in chart understanding leveraging large foundation models. The studies mentioned in this paper, along with emerging new research, will be continually updated at:

### What problem does this paper attempt to solve?

This paper aims to provide a comprehensive overview of the latest advancements, challenges, and future directions in automatic chart understanding, particularly in the context of the rise of large foundation models (such as large language models). Specifically:

1. **Background and Challenges**:
   - Charts play a crucial role in data visualization, transforming complex data into intuitive information to aid decision-making.
   - Automatic chart understanding has advanced significantly in recent years, especially with the development of large foundation models.
   - Traditional methods are limited in domain transfer and reasoning capabilities, while large vision-language models have brought substantial progress.
2. **Research Scope**:
   - Introduces the basic building blocks of automatic chart understanding, including visual encoders, Optical Character Recognition (OCR) modules, and text decoders.
   - Analyzes various chart understanding tasks and their evaluation metrics, such as chart question answering, chart description generation, chart-to-table conversion, and chart fact-checking.
   - Discusses the sources and diversity of chart datasets and analyzes their characteristics and limitations.
3. **Current Status and Improvements**:
   - Provides an overview of the state-of-the-art performance on each task and explores how to further improve model performance.
   - Highlights existing challenges, such as understanding domain-specific charts, the limited effort devoted to developing evaluation metrics, and agent-oriented settings.
4. **Future Directions**:
   - Proposes key areas for future research, such as developing models to handle complex charts, improving evaluation metrics, and diversifying datasets.
   - Aims to promote further development at the intersection of data visualization and machine learning.

In summary, this paper is intended as a detailed resource for researchers and practitioners in natural language processing, computer vision, and data analysis, supporting future research in chart understanding.

From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

Chart Understanding with Large Language Model

An Intelligent Approach to Automatically Discovering Visual Insights

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

ChartThinker: A Contextual Chain-of-Thought Approach to Optimized Chart Summarization

A Survey and Approach to Chart Classification

StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding

StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding

CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

ChartBench: A Benchmark for Complex Visual Reasoning in Charts

ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language

ChartInsights: Evaluating Multimodal Large Language Models for Low-Level Chart Question Answering

EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding

CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models

TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning

Enhancing Question Answering on Charts Through Effective Pre-training Tasks

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

On Pre-training of Multimodal Language Models Customized for Chart Understanding

Transformers Utilization in Chart Understanding: A Review of Recent Advances & Future Trends

Chart-to-Text: A Large-Scale Benchmark for Chart Summarization