LMM Chemical Research with Document Retrieval

Kevin Kawchak

DOI: https://doi.org/10.26434/chemrxiv-2024-p91gm

2024-08-13

Abstract: Chemical research progresses more effectively when Large Multimodal Models (LMMs) are combined with document retrieval and recently published literature. The methods described here show significant improvement over previously tested Large Language Model (LLM) multi-document workflows for characterization assistance and for generating new reactions. Here, the 3.5 Sonnet, ScholarGPT, and ChatGPT 4o LMMs each processed either 5 images or 5 supplementary documents from leading 2024 journals. Each of the three models performed inference on a detailed prompt to produce a response that included context from the attachments. The LMMs were not told which of the 5 files contained the answer. The main findings were that 3.5 Sonnet had an average score of 9.8 for images, while two judges awarded high scores to ChatGPT 4o (9.7, 9.4) and ScholarGPT (9.5, 9.4) for document analysis. A human evaluator judged the image uploads, while document processing was evaluated by the Llama 3.1 405B and Nemotron 4 340B LLMs, whose scores correlated well and improved explainability. Highlights include 3.5 Sonnet's ability to accurately interpret a two-dimensional nuclear magnetic resonance (2D NMR) spectrum, along with Judge Llama 3.1's ability to provide consistently formatted scores with explanations. The results shown here help illustrate AI's continued revitalization of the established chemical research field.

Chemistry

What problem does this paper attempt to address?

The paper explores the application of Large Multimodal Models (LMMs) combined with document retrieval in chemical research. The goal of the study is to evaluate how effectively and accurately LMMs handle chemical images and documents, particularly when assisting with chemical characterization and generating new reactions. Specifically, the paper addresses its research objectives through the following points:

1. **Comparing the performance of different LMMs**: Three LMMs (3.5 Sonnet, ScholarGPT, and ChatGPT 4o) each process five images or five supplementary documents selected from top chemical journals, and the study assesses how well these models solve chemical problems.
2. **Refining evaluation criteria**: To keep the evaluation accurate, the researchers developed detailed evaluation criteria, including whether the correct contextual information was provided, whether the correct image source was identified, and the accuracy of the generated results.
3. **Improving interpretability**: Workflows such as Retrieval Augmented Generation (RAG) and Direct Document Retrieval (DR) were used to make the model outputs easier to interpret.
4. **Quantifying performance**: The performance of the models was measured with quantitative scores. For example, 3.5 Sonnet received a high average score of 9.8 in image analysis, while ChatGPT 4o (9.7, 9.4) and ScholarGPT (9.5, 9.4) also received high scores in document analysis.

In summary, this paper aims to verify experimentally the potential of LMMs in chemical research and to show how these models can assist chemists in performing complex analytical tasks.
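The quantitative scoring described above can be illustrated with a minimal sketch. It only aggregates the document-analysis scores quoted in the abstract. The names `document_scores` and `summarize` are hypothetical, and because the abstract does not say which judge gave which score, each pair is left unlabeled. This is not the paper's evaluation code.

```python
from statistics import mean

# Document-analysis scores quoted in the abstract (two LLM judges per model).
# Assumption: the abstract does not say which judge gave which score,
# so each pair is stored without judge labels.
document_scores = {
    "ChatGPT 4o": (9.7, 9.4),
    "ScholarGPT": (9.5, 9.4),
}


def summarize(scores):
    """Return {model: (mean score, gap between the two judges)}, rounded to 2 places."""
    return {
        model: (round(mean(vals), 2), round(max(vals) - min(vals), 2))
        for model, vals in scores.items()
    }


summary = summarize(document_scores)
for model, (avg, gap) in summary.items():
    print(f"{model}: mean={avg}, judge gap={gap}")
```

A small gap between the two judges for a given model is one simple way to see the agreement between judges that the abstract describes.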
