Abstract: We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text. We construct the benchmark from publicly available materials science text data to encompass seven different NLP tasks, including conventional NLP tasks like named entity recognition and relation classification, as well as NLP tasks specific to materials science, such as synthesis action retrieval, which relates to creating synthesis procedures for materials. We study various BERT-based models pretrained on different scientific text corpora on MatSci-NLP to understand the impact of pretraining strategies on understanding materials science text. Given the scarcity of high-quality annotated data in the materials science domain, we perform our fine-tuning experiments with limited training data to encourage generalization across MatSci-NLP tasks. Our experiments in this low-resource training setting show that language models pretrained on scientific text outperform BERT trained on general text. MatBERT, a model pretrained specifically on materials science journals, generally performs best for most tasks. Moreover, we propose a unified text-to-schema approach for multitask learning on MatSci-NLP and compare its performance with traditional fine-tuning methods. In our analysis of different training methods, we find that our proposed text-to-schema methods inspired by question-answering consistently outperform single and multitask NLP fine-tuning methods. The code and datasets are publicly available at https://github.com/BangLab-UdeM-Mila/NLP4MatSci-ACL23.

What problem does this paper attempt to address?

This paper addresses natural language processing (NLP) for materials science text. Specifically:

1. **Develop and evaluate NLP models applicable to materials science texts**: Materials science research produces a large amount of text, such as journal articles, patents, and technical reports. These texts contain rich knowledge, but effective tools for processing and understanding them are lacking. The paper therefore builds an NLP benchmark (MatSci-NLP) specifically for materials science text to evaluate how different NLP models perform on materials science tasks.
2. **Explore the impact of pre-training strategies on downstream task performance**: Because high-quality labeled data is scarce in materials science, the authors examine how different pre-training strategies (for example, pre-training on general text versus domain-specific text) affect model performance on materials science tasks. In particular, they ask whether language models dedicated to materials science (such as MatBERT) are more effective than general-purpose language models (such as BERT).
3. **Propose and validate a new multi-task learning method**: To improve learning efficiency in low-resource settings, the paper proposes a text-to-schema multi-task learning method and compares it with traditional single-task and multi-task fine-tuning. The study reports that the text-to-schema method consistently outperforms both baselines.

### Specific problem decomposition

- **Q1: What is the impact of in-domain pre-training on the downstream performance of language models on MatSci-NLP tasks?**
  - Models pre-trained specifically on materials science text (such as MatBERT) perform best on most tasks, followed by SciBERT. This indicates that in-domain pre-training helps models acquire knowledge of the relevant field.
- **Q2: How do in-context data schemas and multi-task learning affect learning efficiency in low-resource training settings?**
  - The experimental results show that the question-answering-inspired text-to-schema method (Task-Schema) performs best for most models and is superior to the single-task and multi-task fine-tuning settings.

By investigating these questions, the authors hope to advance NLP tools for materials science and thereby accelerate the discovery, synthesis, and application of new materials.
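
To make the text-to-schema idea more concrete, the following is a minimal sketch of how examples from different tasks might be serialized into one shared input format so that a single model can be trained on all of them. It assumes a simple string template; the `TaskSchema` class, the `build_input` helper, the field names, and the label sets are all hypothetical illustrations and are not taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TaskSchema:
    """Hypothetical description of one task: its name and its label set."""
    name: str
    labels: Tuple[str, ...]


def build_input(schema: TaskSchema, text: str) -> str:
    """Serialize a task description and a raw text into one model input.

    The template is an assumption made for illustration only: every task
    shares the same three fields, so one model sees a uniform format.
    """
    label_str = ", ".join(schema.labels)
    return f"task: {schema.name} | labels: {label_str} | text: {text}"


# Two hypothetical tasks rendered through the same template.
ner = TaskSchema("named entity recognition", ("material", "property"))
rel = TaskSchema("relation classification", ("synthesis of", "no relation"))

sentence = "LiFePO4 was annealed at 700 C."
for schema in (ner, rel):
    print(build_input(schema, sentence))
```

Because the task name and label set travel inside the input, examples from several tasks can be mixed in a single training batch, which is the property a unified multitask setup relies on.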

MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling
