Could Chemical LLMs benefit from Message Passing

Jiaqing Xie, Ziheng Chi
2024-08-26
Abstract: Pretrained language models (LMs) showcase significant capabilities in processing molecular text, while message passing neural networks (MPNNs) demonstrate resilience and versatility in molecular science. Despite these advances, few studies investigate the bidirectional interactions between molecular structures and their corresponding textual representations. In this paper, we therefore propose two strategies to evaluate whether integrating the two sources of information can enhance performance: contrastive learning, which uses an MPNN to supervise the training of the LM, and fusion, which exploits information from both models. Our empirical analysis shows that the integration approaches outperform baselines on smaller molecular graphs but yield no performance gains on large-scale graphs.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the following issues:

1. **The relationship between molecular structure and text representation**: Pretrained language models (LMs) have shown significant capabilities in handling molecular text, and message passing neural networks (MPNNs) have demonstrated robustness and versatility in molecular science, yet there is little research on how to combine the two to improve performance.
2. **Effectiveness of information integration methods**: Two strategies are proposed to evaluate whether information integration can improve performance: contrastive learning (using an MPNN to supervise LM training) and fusion (exploiting information from both models). The study finds that these methods outperform baseline models on small molecular graphs but yield no improvement on large-scale graphs.
3. **Impact of different dataset partitioning strategies**: The experiments analyze how different dataset partitioning strategies and random seeds affect overall performance.

In summary, the paper explores whether chemical language models can benefit from message passing mechanisms and experimentally evaluates the effectiveness of the different integration methods.
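To make the two integration strategies concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the summary does not state the exact loss or fusion operator, so an InfoNCE-style objective over cosine similarities and fusion by concatenation are assumed here. The function names `contrastive_loss` and `fuse`, the `temperature` parameter, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(lm_emb, mpnn_emb, temperature=0.1):
    """Assumed InfoNCE-style loss (hypothetical, not taken from the paper).

    Row i of lm_emb (text embedding of molecule i) is pulled toward row i of
    mpnn_emb (graph embedding of the same molecule) and pushed away from the
    graph embeddings of the other molecules in the batch.
    """
    total = 0.0
    for i, z_text in enumerate(lm_emb):
        logits = [cosine(z_text, z_graph) / temperature for z_graph in mpnn_emb]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        # Cross-entropy with the matching graph embedding as the target.
        total += log_sum_exp - logits[i]
    return total / len(lm_emb)


def fuse(lm_emb, mpnn_emb):
    """Assumed fusion: concatenate the text and graph embedding per molecule."""
    return [list(t) + list(g) for t, g in zip(lm_emb, mpnn_emb)]


# Toy batch of two molecules with 2-dimensional embeddings.
lm = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.0], [0.0, 3.0]]   # graph embeddings point the same way
swapped = [[0.0, 3.0], [2.0, 0.0]]   # graph embeddings paired with the wrong text

loss_aligned = contrastive_loss(lm, aligned)
loss_swapped = contrastive_loss(lm, swapped)
fused = fuse(lm, aligned)
```

In this toy batch the loss is close to zero when text and graph embeddings of the same molecule agree and much larger when the pairing is swapped, which is the signal that would let an MPNN supervise the LM. The fused vectors would then feed a downstream prediction head, which is omitted here.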