LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

Pingchuan Ma, Tsun-Hsuan Wang, Minghao Guo, Zhiqing Sun, Joshua B. Tenenbaum, Daniela Rus, Chuang Gan, Wojciech Matusik

2024-05-16

Abstract: Large Language Models have recently gained significant attention in scientific discovery for their extensive knowledge and advanced reasoning capabilities. However, they encounter challenges in effectively simulating observational feedback and grounding it with language to propel advancements in physical scientific discovery. Conversely, human scientists undertake scientific discovery by formulating hypotheses, conducting experiments, and revising theories through observational analysis. Inspired by this, we propose to enhance the knowledge-driven, abstract reasoning abilities of LLMs with the computational strength of simulations. We introduce Scientific Generative Agent (SGA), a bilevel optimization framework: LLMs act as knowledgeable and versatile thinkers, proposing scientific hypotheses and reasoning about discrete components, such as physics equations or molecule structures; meanwhile, simulations function as experimental platforms, providing observational feedback and optimizing via differentiability for continuous parts, such as physical parameters. We conduct extensive experiments to demonstrate our framework's efficacy in constitutive law discovery and molecular design, unveiling novel solutions that differ from conventional human expectations yet remain coherent upon analysis.

Machine Learning; Artificial Intelligence; Computational Engineering, Finance, and Science

What problem does this paper attempt to address?

The paper aims to automate and accelerate discovery in the physical sciences. It focuses on a specific limitation of large language models (LLMs): they struggle to simulate observational feedback and to ground that feedback in language, which limits their usefulness for physical scientific discovery. To address this, the authors propose a bilevel optimization framework called the Scientific Generative Agent (SGA), which combines the knowledge-driven, abstract reasoning of LLMs with the computational strength of simulation. At the outer level, the LLM acts as a knowledgeable and versatile thinker, proposing scientific hypotheses and reasoning about discrete components such as physics equations or molecular structures. At the inner level, a simulation serves as the experimental platform: it provides observational feedback and, because it is differentiable, optimizes the continuous parts of each hypothesis, such as physical parameters. The authors evaluate the framework on constitutive law discovery and molecular design, and report solutions that differ from conventional human expectations yet remain coherent upon analysis. The paper also explores how adjusting the LLM's generation temperature trades off exploration against exploitation, which further improves performance.
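
The outer/inner split described above can be illustrated with a toy sketch. This is not the authors' implementation: the fixed `CANDIDATES` dictionary and the `propose` function are hypothetical stand-ins for the LLM, a hand-written gradient on synthetic data stands in for the differentiable simulation, and all names and numbers are assumptions chosen for illustration.

```python
import random

# Synthetic "observations" from a hidden ground truth y = 2.5 * x**2
# (an assumption for this toy example, not data from the paper).
XS = [0.5, 1.0, 1.5, 2.0]
YS = [2.5 * x**2 for x in XS]

# Discrete hypothesis space: candidate forms y = theta * basis(x).
# In SGA an LLM proposes such forms; this fixed dictionary is a stand-in.
CANDIDATES = {
    "linear": lambda x: x,
    "quadratic": lambda x: x**2,
    "cubic": lambda x: x**3,
}


def loss(basis, theta):
    """Mean squared error between the hypothesis and the observations."""
    return sum((theta * basis(x) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def inner_optimize(basis, theta=1.0, lr=0.01, steps=500):
    """Inner level: fit the continuous parameter theta by gradient descent.

    The analytic gradient of the MSE stands in for gradients that a
    differentiable simulation would provide.
    """
    for _ in range(steps):
        grad = sum(
            2 * (theta * basis(x) - y) * basis(x) for x, y in zip(XS, YS)
        ) / len(XS)
        theta -= lr * grad
    return theta, loss(basis, theta)


def propose(history, rng):
    """Outer level (hypothetical LLM stand-in): pick an untried hypothesis."""
    untried = [name for name in CANDIDATES if name not in history]
    return rng.choice(untried)


def bilevel_search(seed=0):
    """Alternate discrete proposals with continuous fitting; keep the best."""
    rng = random.Random(seed)
    history = {}  # hypothesis name -> (fitted theta, final loss)
    while len(history) < len(CANDIDATES):
        name = propose(history, rng)
        history[name] = inner_optimize(CANDIDATES[name])
    best = min(history, key=lambda name: history[name][1])
    return best, history[best][0], history


if __name__ == "__main__":
    best, theta, _ = bilevel_search()
    print(best, round(theta, 4))
```

In this sketch the proposer ignores the feedback stored in `history` apart from avoiding repeats. In the framework described above, the LLM would instead read the simulation feedback and revise its hypotheses, and the sampling temperature would control how far new proposals stray from earlier ones.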

Optimal Decision Making Through Scenario Simulations Using Large Language Models

LLM experiments with simulation: Large Language Model Multi-Agent System for Simulation Model Parametrization in Digital Twins

Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations

LLMs are Highly-Constrained Biophysical Sequence Optimizers

Enhancing LLMs for Power System Simulations: A Feedback-driven Multi-agent Framework

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

Logic-Enhanced Language Model Agents for Trustworthy Social Simulations

Synergistic Simulations: Multi-Agent Problem Solving with Large Language Models

LLM-Augmented Agent-Based Modelling for Social Simulations: Challenges and Opportunities

Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

Solving General Natural-Language-Description Optimization Problems with Large Language Models

Synergizing Human Expertise and AI Efficiency with Language Model for Microscopy Operation and Automated Experiment Design

Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation

User Behavior Simulation with Large Language Model based Agents

Enabling Large Language Models to Perform Power System Simulations with Previously Unseen Tools: A Case of Daline

From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems

Sense and Sensitivity: Evaluating the simulation of social dynamics via Large Language Models

LLM4ED: Large Language Models for Automatic Equation Discovery

LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis

MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration