Weichen Dai, Yezeng Chen, Zijie Dai, Zhijie Huang, Yubo Liu, Yixuan Pan, Baiyang Song, Chengli Zhong, Xinhe Li, Zeyu Wang, Zhuoying Feng, Yi Zhou
Abstract: Artificial intelligence is gradually demonstrating its immense potential, and increasing attention is being given to how AI can be harnessed to advance scientific research. In this vision paper, we present our perspectives on how AI can better assist scientific inquiry and explore corresponding technical approaches. We have proposed and open-sourced a large model from our KALE-LM model series, Llama3-KALE-LM-Chem-8B, which has achieved outstanding performance in tasks related to the field of chemistry. We hope that our work serves as a strong starting point, helping to realize more intelligent AI and promoting the advancement of human science and technology, as well as societal development.
Artificial Intelligence; Computational Engineering, Finance, and Science; Computation and Language
### The Problem the Paper Attempts to Solve
This paper explores how Artificial Intelligence (AI) can better assist scientific research and proposes corresponding technical methods. Specifically, it introduces a series of large models named KALE-LM and highlights one chemistry-specific member of that series, Llama3-KALE-LM-Chem-8B, which the authors report performs strongly on chemistry tasks.
### Background
In recent years, AI has advanced rapidly and now handles many tasks that demand a high level of intelligence, in some cases surpassing human performance. These tasks include speech recognition, facial recognition, image recognition, games (such as Go, StarCraft, and Dota 2), text generation, image generation, video generation, machine translation, knowledge Q&A, debating, and solving advanced mathematical problems. The paper regards science as one of the most important fields for applying AI, describing it as the crown of human civilization and the cornerstone of many industries, and therefore a core driver of human progress.
### Current AI Applications in Science
Currently, there are three main AI technologies for building scientific brains:
1. **Specialized models for specific problems**: Specialized deep neural network models are built to reduce the search space of a particular problem. An example is Google DeepMind's AlphaFold series for protein structure prediction.
2. **Deep neural networks with reasoning engines**: Deep neural networks are combined with reasoning engines to offer new perspectives and strengthen thinking and decision-making. Examples are AlphaGeometry and FunSearch.
3. **Large-scale model-based approaches**: Large-scale models are used through different forms of interaction. Examples are ChemCrow in chemistry and Med-PaLM 2 in medicine.
### Existing Problems
Despite this progress, the paper argues that these technologies still cannot effectively integrate scientific knowledge and logic into AI models. As a result, current AI cannot reliably learn, understand, or apply the scientific principles and logical reasoning accumulated by the greatest scientists in history. Embedding knowledge and logic is therefore one of the key challenges in developing a scientific brain.
### Vision of the Scientific Brain
Large-scale models are a significant advance in AI: they can exhibit human-like "emergent" general intelligence, learn knowledge across multiple domains, and handle a variety of tasks. To make AI useful in science, however, the key is to first clarify what scientists need and then train large-scale models to provide those functions. The paper summarizes four key capabilities: information extraction, semantic parsing, knowledge Q&A, and reasoning and planning.
### Practice in the Field of Chemistry
The paper introduces Llama3-KALE-LM-Chem-8B, the first chemistry-specific KALE-LM model, which is based on Llama3. Training is divided into two stages: continual pre-training, followed by supervised fine-tuning. According to the reported evaluation, Llama3-KALE-LM-Chem-8B significantly outperforms other models of similar scale on chemistry tasks, especially in basic chemistry capabilities, scientific Q&A, and chemical meta-information extraction.
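The difference between the two training stages can be illustrated with a minimal sketch of how training examples are commonly prepared for each. This is not the authors' pipeline: the summary does not describe their data format, so the whitespace tokenizer `toy_tokenize`, the helper names, and the choice to mask prompt tokens during fine-tuning are all assumptions made for illustration. The value `-100` is the label that many training frameworks conventionally skip when computing the loss.

```python
IGNORE_INDEX = -100  # label value conventionally ignored by the loss function


def toy_tokenize(text):
    """Hypothetical tokenizer: split on whitespace and map each word to a toy id."""
    return [sum(ord(ch) for ch in word) % 1000 for word in text.split()]


def pretraining_example(document):
    """Stage 1 sketch (pre-training on domain text): every token is a target."""
    ids = toy_tokenize(document)
    return {"input_ids": ids, "labels": list(ids)}


def sft_example(prompt, response):
    """Stage 2 sketch (supervised fine-tuning): only response tokens are targets.

    Masking the prompt is a common convention, assumed here, not confirmed
    by the paper summary.
    """
    prompt_ids = toy_tokenize(prompt)
    response_ids = toy_tokenize(response)
    return {
        "input_ids": prompt_ids + response_ids,
        "labels": [IGNORE_INDEX] * len(prompt_ids) + response_ids,
    }


def count_targets(example):
    """Number of positions that would contribute to the training loss."""
    return sum(1 for label in example["labels"] if label != IGNORE_INDEX)


stage1 = pretraining_example("Sodium chloride dissolves readily in water")
stage2 = sft_example("What is the formula of water ?", "H2O")

print(count_targets(stage1), "of", len(stage1["input_ids"]), "tokens are targets")
print(count_targets(stage2), "of", len(stage2["input_ids"]), "tokens are targets")
```

In this sketch, the first stage learns from every token of raw domain text, while the second stage learns only from the answer part of each prompt and response pair, which is one way a model can absorb domain knowledge first and task behavior second.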
### Conclusion
This paper proposes four core tasks that a scientific brain needs to focus on and explores how to accomplish them by strengthening the knowledge and logic of large-scale models. On this basis, the research team has carried out several explorations, and Llama3-KALE-LM-Chem-8B is the first result presented. The authors hope that their work will promote AI research and development in the scientific field.