ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback

Henry W. Sprueill, Carl Edwards, Khushbu Agarwal, Mariefel V. Olarte, Udishnu Sanyal, Conrad Johnston, Hongbin Liu, Heng Ji, Sutanay Choudhury
2024-06-08
Abstract: The discovery of new catalysts is essential for the design of new and more efficient chemical processes in order to transition to a sustainable future. We introduce an AI-guided computational screening framework unifying linguistic reasoning with quantum-chemistry based feedback from 3D atomistic representations. Our approach formulates catalyst discovery as an uncertain environment where an agent actively searches for highly effective catalysts via the iterative combination of large language model (LLM)-derived hypotheses and atomistic graph neural network (GNN)-derived feedback. Identified catalysts in intermediate search steps undergo structural evaluation based on spatial orientation, reaction pathways, and stability. Scoring functions based on adsorption energies and reaction energy barriers steer the exploration in the LLM's knowledge space toward energetically favorable, high-efficiency catalysts. We introduce planning methods that automatically guide the exploration without human input, providing competitive performance against expert-enumerated chemical descriptor-based implementations. By integrating language-guided reasoning with computational chemistry feedback, our work pioneers AI-accelerated, trustworthy catalyst discovery.
Chemical Physics; Artificial Intelligence; Computational Engineering, Finance, and Science; Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of discovering new catalysts. Catalyst discovery currently relies on combining chemical descriptors, but the understanding of these descriptors rests on empirical experience, which poses a challenge for computer-aided catalysis research. The paper introduces an AI-guided computational screening framework that combines the linguistic reasoning of a large language model (LLM) with quantum-chemistry-based feedback from an atomistic graph neural network (GNN) operating on 3D structures. It treats catalyst discovery as an uncertain environment in which an agent actively searches for effective catalysts by iteratively combining LLM-derived hypotheses with GNN-derived feedback. Catalysts identified at intermediate search steps are evaluated structurally in terms of spatial orientation, reaction pathways, and stability, and scoring functions based on adsorption energies and reaction energy barriers guide the exploration. Automated planning methods steer the search without human input and perform competitively against expert-enumerated, chemical-descriptor-based implementations. The paper also emphasizes that integrating language-guided reasoning with computational chemistry feedback can accelerate trustworthy catalyst discovery.
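The iterative loop described above (propose hypotheses, score them with energy-based feedback, keep the most promising search states) can be illustrated with a small beam search. This is a minimal sketch under stated assumptions, not the authors' implementation: the lookup tables `TOY_LLM_ANSWERS` and `TOY_ADSORPTION_ENERGY` are hypothetical stand-ins for the LLM and the GNN, the energy values are invented for illustration, and the reward (the negated mean adsorption energy) is an assumed simplification of the paper's scoring functions.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical stand-in for the LLM: search criterion -> suggested catalysts.
TOY_LLM_ANSWERS = {
    "high activity": ["Pt", "Pd", "Ni"],
    "low cost": ["Cu", "Zn", "Ni"],
    "high selectivity": ["CuZn", "Cu", "Pd"],
}

# Hypothetical stand-in for GNN-predicted adsorption energies in eV.
# These numbers are made up for illustration only.
TOY_ADSORPTION_ENERGY = {
    "Cu": -0.4, "Zn": -0.2, "Pt": -1.1, "Pd": -0.9, "Ni": -0.7, "CuZn": -0.6,
}


@dataclass(frozen=True)
class SearchState:
    criteria: tuple    # criteria accumulated in the query so far
    candidates: tuple  # catalysts proposed for that query
    reward: float      # feedback score for those catalysts


def propose_catalysts(criteria):
    """Stub 'LLM': return the three catalysts most often suggested
    across the given criteria (ties broken alphabetically)."""
    votes = Counter(cat for c in criteria for cat in TOY_LLM_ANSWERS[c])
    ranked = sorted(votes, key=lambda cat: (-votes[cat], cat))
    return tuple(ranked[:3])


def score(candidates):
    """Stub 'GNN feedback'. Assumption: a more negative mean adsorption
    energy is better, so the reward is the negated mean."""
    total = sum(TOY_ADSORPTION_ENERGY[c] for c in candidates)
    return -total / len(candidates)


def beam_search(beam_width=2, depth=2):
    """Expand each state by one unused criterion, score every expansion,
    and keep only the `beam_width` highest-reward states per level."""
    beam = [SearchState((), (), float("-inf"))]
    best = beam[0]
    for _ in range(depth):
        expansions = []
        for state in beam:
            for criterion in TOY_LLM_ANSWERS:
                if criterion in state.criteria:
                    continue
                criteria = state.criteria + (criterion,)
                candidates = propose_catalysts(criteria)
                expansions.append(
                    SearchState(criteria, candidates, score(candidates))
                )
        if not expansions:
            break
        expansions.sort(key=lambda s: s.reward, reverse=True)
        beam = expansions[:beam_width]
        best = max(best, beam[0], key=lambda s: s.reward)
    return best


if __name__ == "__main__":
    result = beam_search()
    print(result.criteria, result.candidates, round(result.reward, 3))
```

In a real system, `propose_catalysts` would issue a prompt to an LLM and `score` would run a trained atomistic GNN on relaxed 3D adsorbate-catalyst structures; the search skeleton itself would stay largely the same.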