Automated electrosynthesis reaction mining with multimodal large language models (MLLMs)

Shi Xuan Leong, Sergio Pablo-García, Zijian Zhang, Alán Aspuru-Guzik

DOI: https://doi.org/10.26434/chemrxiv-2024-7fwxv

2024-07-11

Abstract: Leveraging the chemical data that is available in legacy formats such as publications and patents is a significant challenge for the community. Automated reaction mining offers a promising solution to unleash this knowledge into a learnable digital form and therefore help expedite materials and reaction discovery. However, existing reaction mining toolkits are limited to single input modalities (text or images) and cannot effectively integrate heterogeneous data that is scattered across different modalities including text, tables, and figures. In this work, we go beyond single input modalities and explore multimodal large language models (MLLMs) for the analysis of diverse data inputs for automated electrosynthesis reaction mining. We compiled a test dataset of 65 articles and employed it to benchmark five prominent MLLMs against two critical tasks: (i) reaction diagram parsing and (ii) resolving cross-modality data interdependencies. The frontrunner MLLM achieved ≥ 96% accuracy in both tasks, with the strategic integration of single-shot visual prompts and image pre-processing techniques. We integrate this capability into a toolkit named MERMES (Multimodal Reaction Mining pipeline for ElectroSynthesis). Our toolkit functions as an end-to-end MLLM-powered pipeline that integrates article retrieval, information extraction and multimodal analysis for streamlining and automating knowledge extraction. This work lays the groundwork for the increased utilization of MLLMs to accelerate the digitization of chemistry knowledge for data-driven research.

Chemistry

What problem does this paper attempt to address?

The paper addresses how multimodal large language models (MLLMs) can be used to automatically extract key information about electrosynthesis reactions from the literature. The main challenges it identifies are:

1. **Digitization of Chemical Knowledge**: Much existing chemical knowledge sits in legacy formats (such as the HTML and PDF files of publications and patents) that cannot be used directly for data-driven research. Automated reaction mining can convert this knowledge into a learnable digital form and so accelerate the discovery of materials and reactions.
2. **Limitations of Unimodal Tools**: Existing reaction mining tools are limited to a single input modality (text or images) and cannot effectively integrate heterogeneous data scattered across modalities (text, tables, and figures).
3. **Complexity and Diversity of Data**: Reaction conditions are usually dispersed across different parts of an article (main text, supplementary materials, tables, and figures) and are often buried in a large amount of irrelevant content, which makes it difficult to extract accurate experimental data records.

To address these issues, the researchers developed an MLLM-driven toolkit named MERMES (Multimodal Reaction Mining pipeline for ElectroSynthesis). It relies on two key tasks:

1. **Reaction Diagram Parsing**: Extracting reaction conditions from reaction diagrams and classifying them into 10 categories (such as anode, cathode, electrolyte/additive, and solvent).
2. **Resolving Cross-Modality Data Interdependencies**: Identifying footnote labels in figures and linking them to their definitions in the text, so that the extracted records are consistent and complete.

Through these methods, MERMES can effectively integrate and process multimodal information from scientific literature, thereby achieving automated reaction mining. This lays the foundation for comprehensive digitization of chemical knowledge and data-driven research.
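
The footnote-linking step in the second task can be illustrated with a small sketch. This is not the MERMES implementation: in the paper the matching is performed by an MLLM, whereas the code below uses plain regular expressions to show what "linking a footnote label to its definition" means for an extracted record. The footnote format (`[a] ...`), the function names, and the sample values are all hypothetical, and only the four categories named above are used because the remaining six are not listed here.

```python
import re

# Assumption: footnote definitions appear one per line as "[a] some text".
FOOTNOTE_DEF = re.compile(r"^\[([a-z])\]\s*(.+)$", re.MULTILINE)
# Assumption: a footnote label is attached to the end of a value, e.g. "MeCN[a]".
FOOTNOTE_REF = re.compile(r"\[([a-z])\]$")


def parse_footnotes(text: str) -> dict[str, str]:
    """Collect footnote definitions from caption or body text, keyed by label."""
    return {label: definition.strip() for label, definition in FOOTNOTE_DEF.findall(text)}


def resolve_conditions(conditions: dict[str, str], footnotes: dict[str, str]) -> dict[str, dict]:
    """Strip footnote labels from condition values and attach their definitions."""
    resolved = {}
    for category, value in conditions.items():
        match = FOOTNOTE_REF.search(value)
        if match is None:
            resolved[category] = {"value": value, "note": None}
            continue
        resolved[category] = {
            "value": value[: match.start()].strip(),
            # Stays None when the label has no definition in the text.
            "note": footnotes.get(match.group(1)),
        }
    return resolved


if __name__ == "__main__":
    # Made-up example inputs, for illustration only.
    caption = "[a] Dried over molecular sieves.\n[b] Isolated yield."
    diagram = {"anode": "C", "cathode": "Pt", "solvent": "MeCN[a]"}
    for category, entry in resolve_conditions(diagram, parse_footnotes(caption)).items():
        print(category, entry)
```

A rule-based matcher like this breaks as soon as label styles vary between journals, which is part of the motivation for handling the step with an MLLM that reads the figure and the text together.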

Integrating Machine Learning and Large Language Models to Advance Exploration of Electrochemical Reactions

Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation

An Automatic End-to-end Chemical Synthesis Development Platform Powered by Large Language Models

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

Accelerated end-to-end chemical synthesis development with large language models

LMM Chemical Research with Document Retrieval

Fine-tuning Large Language Models for Chemical Text Mining

Reaction Miner: an Integrated System for Chemical Reaction Extraction from Textual Data

Bridging Chemical Knowledge and Machine Learning for Performance Prediction of Organic Synthesis

Extracting Structured Data from Organic Synthesis Procedures Using a Fine-Tuned Large Language Model

An Autonomous Large Language Model Agent for Chemical Literature Data Mining

ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models

Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design

MolX: Enhancing Large Language Models for Molecular Learning with A Multi-Modal Extension

Validation of the Scientific Literature via Chemputation Augmented by Large Language Models

ReLM: Leveraging Language Models for Enhanced Chemical Reaction Prediction

Advances in machine learning with chemical language models in molecular property and reaction outcome predictions