Abstract: Rapid development of artificial intelligence has drastically accelerated scientific discovery. Trained with large-scale observation data, deep neural networks extract the underlying patterns in an end-to-end manner and assist human researchers with highly precise predictions in unseen scenarios. The recent rise of Large Language Models (LLMs) and the autonomous agents they empower enables scientists to gain help through interaction at different stages of their research, including but not limited to literature review, research ideation, idea implementation, and academic writing. However, AI researchers instantiated by foundation-model-empowered agents with full-process autonomy are still in their infancy. In this paper, we study $\textbf{AI-Generated Science}$ (AIGS), where agents independently and autonomously complete the entire research process and discover scientific laws. By revisiting the definition of scientific research, we argue that $\textit{falsification}$ is the essence of both the human research process and the design of an AIGS system. Through the lens of falsification, prior systems aiming at AI-Generated Science either lack this component in their design or rely heavily on existing verification engines that narrow their use to specialized domains. In this work, we propose Baby-AIGS as a baby-step demonstration of a full-process AIGS system, which is a multi-agent system with agents in roles representing key research processes. By introducing FalsificationAgent, which identifies and then verifies possible scientific discoveries, we empower the system with explicit falsification. Experiments on three tasks preliminarily show that Baby-AIGS could produce meaningful scientific discoveries, though not on par with experienced human researchers. Finally, we discuss the limitations of current Baby-AIGS, actionable insights, and related ethical issues in detail.

What problem does this paper attempt to address?

This paper addresses how to construct an artificial intelligence system that can independently complete the entire scientific research process, namely AI-Generated Science (AIGS). Specifically, the authors focus on strengthening the system's autonomous falsification ability by introducing the **FalsificationAgent**, since falsification is a core part of the scientific research process. Traditional scientific research proceeds by proposing hypotheses, testing them experimentally, and then falsifying or confirming them. In existing AIGS systems, however, this step is either missing or depends on domain-specific verification engines, which limits their scope of application. The **Baby-AIGS** system proposed in this paper therefore aims to simulate the research process of human scientists through multi-agent collaboration, especially in the falsification stage, in order to achieve more comprehensive automation of scientific research.

### Key Point Summary:

1. **Problem Definition**: How to build an AI system that can independently complete the entire scientific research process, from hypothesis generation to falsification.
2. **Core Challenge**: Existing AIGS systems lack an effective falsification mechanism or rely on domain-specific verification tools, limiting the system's generality and autonomy.
3. **Solution**: Introduce the **FalsificationAgent** and automate the full process, from hypothesis generation to falsification, through multi-agent collaboration.
4. **Objective**: Improve the credibility and scientific rigor of AI-generated scientific discoveries through the falsification process.

### Research Background:

- **Application of AI in Scientific Research**: Deep learning and large language models (LLMs) have significantly accelerated the progress of scientific research, from optimizing specific tasks, to acting as research assistants, to attempts at fully automated scientific research.
- **The Nature of Scientific Research**: According to Popper (1935), the core of scientific research lies in falsification, that is, verifying or refuting hypotheses by designing and executing experiments.

### Baby-AIGS System Design:

- **Design Principles**: The system design is based on three core principles: falsification, creativity, and executability.
- **System Architecture**:
  - **Pre-Falsification Stage**: Covers hypothesis generation, method design, experimental execution, and result analysis, and refines hypotheses and methods over multiple rounds of iteration.
  - **Falsification Stage**: The **FalsificationAgent** identifies key factors to form hypotheses, verifies them through ablation experiments, and finally produces scientific discoveries.

### Experimental Results:

- **Preliminary Experiments**: Experiments were carried out on three tasks: data engineering, self-instruction alignment, and language modeling. The results show that **Baby-AIGS** can independently generate meaningful scientific discoveries, although a gap remains compared with experienced researchers.
- **Performance Improvement**: A consistent performance improvement was observed over the course of method iteration.

### Discussion and Outlook:

- **Limitations**: The current system still underperforms results published at top-level academic conferences and needs further improvement.
- **Ethical Issues**: The potential negative impacts of AIGS systems and strategies for responsible development are discussed.

Through these designs and experiments, the authors present preliminary results on constructing an AI system that can independently complete the entire scientific research process. In particular, the falsification mechanism is introduced to improve the credibility and scientific rigor of the resulting discoveries.
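The falsification stage summarized above (identify candidate key factors, then test each with an ablation experiment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `falsify_by_ablation`, `Verdict`, `toy_evaluate`, the `min_gain` threshold, and the factor names `dedup` and `shuffle` are all hypothetical, and the scoring function is a toy stand-in for a real experiment run.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, FrozenSet, List, Sequence


@dataclass
class Verdict:
    """Outcome of one ablation test (hypothetical structure)."""
    factor: str
    full_score: float
    ablated_score: float
    supported: bool  # True if removing the factor hurt performance enough


def falsify_by_ablation(
    factors: Sequence[str],
    evaluate: Callable[[FrozenSet[str], int], float],
    min_gain: float = 0.01,   # assumed threshold, not from the paper
    repeats: int = 3,         # assumed number of repeated runs (seeds)
) -> List[Verdict]:
    """Test the hypothesis 'factor X contributes to performance' for each factor.

    The hypothesis survives only if the full method beats the ablated method
    by at least `min_gain` on average; otherwise it is treated as falsified.
    """
    full = frozenset(factors)
    full_score = mean(evaluate(full, seed) for seed in range(repeats))
    verdicts = []
    for factor in factors:
        ablated = full - {factor}
        ablated_score = mean(evaluate(ablated, seed) for seed in range(repeats))
        supported = (full_score - ablated_score) >= min_gain
        verdicts.append(Verdict(factor, full_score, ablated_score, supported))
    return verdicts


def toy_evaluate(active: FrozenSet[str], seed: int) -> float:
    """Toy stand-in for an experiment: only 'dedup' actually helps."""
    return 0.5 + (0.1 if "dedup" in active else 0.0)


if __name__ == "__main__":
    for v in falsify_by_ablation(["dedup", "shuffle"], toy_evaluate):
        status = "supported" if v.supported else "falsified"
        print(f"{v.factor}: {status}")
```

In this sketch a claimed factor counts as a discovery only if removing it measurably degrades the result. A real system would replace `toy_evaluate` with actual experiment execution and would likely use a statistical test rather than a fixed threshold.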

AIGS: Generating Science from AI-Powered Automated Falsification
