Polymetis: Large Language Modeling for Multiple Material Domains

Chao Huang, Huichen Xiao, Chen Chen, Chunyan Chen, Yi Zhao, Shiyu Du, Yiming Zhang, He Sha, Ruixin Gu
2024-11-14
Abstract: As large language models spread into more fields, materials science also gains opportunities for AI-driven innovation. Researchers who once relied on manual search for materials science information can now use artificial intelligence as an auxiliary tool to improve research efficiency. To accelerate knowledge acquisition and support intelligent decision-making in materials science research, this paper proposes Polymetis, a large language model for multiple materials domains. It aims to provide highly professional answers to materials questions, covering energy materials, functional materials, alloy materials, physical chemistry, biology, and other materials directions. The model is trained on a dataset of about 2 million materials knowledge instructions. To build this dataset, we developed the Intelligent Extraction Large Model (IELM), which extracts structured knowledge from scientific texts, avoiding much of the cost of manual annotation and improving efficiency. We inject this data into the GLM4-9B model to enhance its inference capabilities across materials domains. In addition, we introduce enhanced prompt strategies so that the model's answers are more organized and comprehensive, providing efficient intelligent support for the diverse needs of materials science exploration and promoting the development of materials science.
Artificial Intelligence
What problem does this paper attempt to address?
This paper attempts to solve several key problems faced in materials science research:

1. **Inefficient traditional materials science information acquisition methods**: The traditional method of relying on manual search for materials-science-related information is inefficient and struggles to meet researchers' need to acquire professional knowledge quickly. Introducing artificial intelligence technology as an auxiliary tool can significantly improve the efficiency of materials science research.
2. **Limitations of existing large language models in the field of materials science**:
   - **Catastrophic forgetting**: Existing large language models are prone to catastrophic forgetting when handling multi-domain tasks, which reduces their generalization ability.
   - **Lack of organization and precision**: The answers of some models lack organization and precision, especially for complex and diverse materials knowledge queries.
   - **Insufficient support for multi-domain conversations**: Existing models perform poorly in conversations that span multiple materials domains, so they cannot meet researchers' need to handle more complex and diverse materials knowledge queries.
3. **High cost of dataset construction**: The traditional manual extraction method is inefficient and costly, which poses a huge challenge to materials science text mining.

To solve these problems, the paper proposes a multidisciplinary materials science large language model named Polymetis. This model aims to provide highly specialized knowledge answers, covering multiple materials directions such as energy materials, functional materials, alloy materials, physical chemistry, and biomaterials.
Specifically, the Polymetis model achieves these goals in the following ways:

- **Developing the Intelligent Extraction Large Model (IELM)**: IELM automatically extracts structured materials-domain knowledge from scientific literature, avoiding the cost of large-scale manual annotation and improving the efficiency of dataset construction.
- **Fine-tuning the GLM4-9B model with parameter-efficient LoRA**: This enhances the model's reasoning ability in multiple materials domains.
- **Introducing an enhanced prompting strategy**: This ensures that the model's answers are more organized and comprehensive, providing efficient intelligent support for materials science research.

Through these methods, the Polymetis model aims to give materials science researchers an accurate and efficient tool for exploring professional knowledge and to promote the development of materials science.
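
To illustrate why the LoRA fine-tuning mentioned above counts as parameter-efficient, the following is a minimal sketch of the general LoRA idea in plain Python, not the authors' training code. LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update, so the adapted weight is W + (alpha / r) · B · A. The matrix sizes, the rank `r`, the scaling factor `alpha`, and the function names are all illustrative assumptions; the summary does not state which values or layers Polymetis uses.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates all d_out * d_in entries of W.
    LoRA trains only B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


def lora_effective_weight(w, a, b, alpha, rank):
    """Compute W + (alpha / rank) * B @ A, the weight used after adaptation."""
    scale = alpha / rank
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
    print(f"full fine-tuning: {full} parameters, LoRA: {lora} parameters")

    # B conventionally starts at zero, so the adapted weight equals W before training.
    w = [[1.0, 0.0], [0.0, 1.0]]
    a = [[0.3, -0.2]]      # rank x d_in
    b = [[0.0], [0.0]]     # d_out x rank, zero-initialized
    print(lora_effective_weight(w, a, b, alpha=2, rank=1))
```

Because only A and B receive gradients, the number of trainable parameters grows linearly with the layer width rather than quadratically, which is what makes this style of fine-tuning practical for a 9B-parameter base model.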