Towards AI Research Agents in the Chemical Sciences

Ofer Shir
DOI: https://doi.org/10.26434/chemrxiv-2024-lf2xx
2024-01-23
Abstract: An underlying problem shared by experimental scientists is to achieve optimal behavior of their systems and arrive at new discoveries while searching over an array of controls. Accordingly, every scientific discovery may be reduced to solving a Combinatorial Optimization problem upon formulating the characteristic array of decision variables. While AI systems already excel in navigating the landscape of possible experiments, we argue herein that they will be able to drive the entire process of scientific experimental research. Especially, in response to the so-called Nobel Turing Challenge, regarding which Kitano envisioned AI Scientists, the goal of this paper is to provide a pragmatic roadmap to obtain AI Research Agents in the Chemical Sciences. We begin by reviewing the existing integration of Computational Intelligence into experimental systems, which already benefit from solving discovery/optimization problems. We mention recent discoveries in the domains of Enzymes' Design, Material Science, Quantum Mechanics, and Postharvest, in which AI systems played active roles in attaining some ground-breaking results, thanks to being conception-free and unbiased by flawed intuition. We then devise a concrete work plan to train agents to formulate hypotheses by Deep Symbolic Reinforcement Learning, using knowledge representations based on processed scientific textbooks. We focus on the Chemical Sciences, which possess stationary Knowledge Graphs, and propose how to obtain an independent AI system at the graduate student level for "core Chemistry".
Chemistry
What problem does this paper attempt to address?
This paper asks how artificial intelligence (AI) can move beyond assisting chemical research to acting as an AI research agent that drives the whole experimental process. The author argues that every scientific discovery can be reduced to a combinatorial optimization problem once its decision variables are formulated, and notes that AI already performs well at experimental design and optimization.
The paper first reviews AI-driven discoveries in enzyme design, materials science, quantum mechanics, and postharvest protocols. In these examples the AI systems were free of preconceptions and unbiased by flawed human intuition, which enabled results that conventional reasoning had not reached. The paper also observes that, as AI takes a larger part in experiments, the role of human scientists shifts from seeking solutions to interpreting the mechanisms behind the results.
It then proposes a practical roadmap: train agents to formulate hypotheses by deep symbolic reinforcement learning, using knowledge representations built from processed scientific textbooks, and use existing algorithms to guide the experimental verification of those hypotheses. The focus is on the chemical sciences because they possess stationary knowledge graphs, and the stated target is an independent AI system at the graduate student level for "core chemistry".
Finally, the paper discusses the feasibility, interpretability, and creativity of such AI research agents. The author considers the roadmap feasible, although it demands significant effort. The claimed advantage of AI in scientific exploration is its freedom from traditional thinking: it is willing to try approaches that may fail, and may thereby push past existing scientific boundaries.
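The framing of discovery as combinatorial optimization over an array of controls can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the control names and levels in `CONTROLS`, the toy objective `simulated_yield` (a stand-in for a real measurement), and the simple single-variable local search (one of many possible search heuristics, not the author's method).

```python
import random
from typing import Callable, Dict, List, Tuple

Setting = Dict[str, object]

# Hypothetical decision variables of an experiment (illustrative names and levels).
CONTROLS: Dict[str, List[object]] = {
    "temperature_C": [20, 40, 60, 80],
    "solvent": ["water", "ethanol", "acetone"],
    "catalyst_loading_pct": [1, 2, 5, 10],
}


def simulated_yield(setting: Setting) -> float:
    """Invented toy objective standing in for a laboratory measurement."""
    score = 0.0
    score -= abs(setting["temperature_C"] - 60) / 20
    score += 1.0 if setting["solvent"] == "ethanol" else 0.0
    score -= abs(setting["catalyst_loading_pct"] - 5) / 5
    return score


def local_search(
    objective: Callable[[Setting], float],
    controls: Dict[str, List[object]],
    budget: int,
    seed: int = 0,
) -> Tuple[Setting, float]:
    """Resample one control per step; keep the candidate if it is not worse."""
    rng = random.Random(seed)
    current = {name: rng.choice(levels) for name, levels in controls.items()}
    current_value = objective(current)
    for _ in range(budget):
        candidate = dict(current)
        name = rng.choice(list(controls))
        candidate[name] = rng.choice(controls[name])
        value = objective(candidate)  # one "experiment" per step
        if value >= current_value:
            current, current_value = candidate, value
    return current, current_value


if __name__ == "__main__":
    best_setting, best_value = local_search(simulated_yield, CONTROLS, budget=500)
    print(best_setting, best_value)
```

Each call to the objective corresponds to running one experiment, so `budget` plays the role of the number of experiments the laboratory can afford. In a real setting the objective would be noisy and expensive, and a more careful optimizer would likely be needed.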