In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs

Grzegorz Kaszuba, Amirhossein D. Naghdi, Dario Massa, Stefanos Papanikolaou, Andrzej Jaszkiewicz, Piotr Sankowski
2024-06-04
Abstract: Large language models manifest the ability of few-shot adaptation to a sequence of provided examples. This behavior, known as in-context learning, allows for performing nontrivial machine learning tasks during inference only. In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? However, this would not be possible for structure property prediction tasks unless an effective method is found to pass atomic-level geometric features to the transformer model. To address this problem, we employ a compound model in which GPT-2 acts on the output of geometry-aware graph neural networks to adapt in-context information. To demonstrate our model's capabilities, we partition the QM9 dataset into sequences of molecules that share a common substructure and use them for in-context learning. This approach significantly improves the performance of the model on out-of-distribution examples, surpassing that of general graph neural network models.
Machine Learning, Materials Science
What problem does this paper attempt to address?
This paper asks whether the in-context learning ability of transformer models can be used to predict the physical properties of out-of-distribution (OOD) molecules from atomic-level geometric features. Deep learning models often perform poorly on compounds unlike those seen during training, which is a bottleneck for the discovery of new drugs and materials. The authors propose a compound model in which GPT-2 operates on the output of geometry-aware graph neural networks, adapting to in-context examples of molecules that share a common substructure. In experiments on the QM9 dataset, partitioned into sequences of such molecules, the approach significantly improves predictive performance on OOD examples and surpasses general graph neural network models. The paper also discusses related work.
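The partitioning idea can be illustrated with a minimal sketch. It is not the authors' procedure, which this summary does not detail. It assumes each molecule is already reduced to a hypothetical tuple `(substructure_key, features, target)`, where `features` stands in for a graph neural network embedding. The function names `build_context_sequences` and `split_by_substructure` are likewise invented for illustration.

```python
from collections import defaultdict
import random


def build_context_sequences(molecules, seq_len, seed=0):
    """Group molecules by substructure key and cut each group into sequences.

    `molecules` is a list of (substructure_key, features, target) tuples.
    This tuple layout is an assumption made for the sketch, not the paper's format.
    """
    groups = defaultdict(list)
    for key, features, target in molecules:
        groups[key].append((features, target))

    rng = random.Random(seed)
    sequences = []
    for key in sorted(groups):
        members = groups[key]
        rng.shuffle(members)
        # Drop the incomplete tail so every sequence holds exactly seq_len molecules.
        for start in range(0, len(members) - seq_len + 1, seq_len):
            chunk = members[start:start + seq_len]
            # All but the last molecule serve as labelled in-context examples;
            # the last one is the query whose target the model must predict.
            context, query = chunk[:-1], chunk[-1]
            sequences.append({
                "key": key,
                "context": context,
                "query_features": query[0],
                "query_target": query[1],
            })
    return sequences


def split_by_substructure(sequences, held_out_keys):
    """Hold out whole substructure groups so test queries are out-of-distribution."""
    train = [s for s in sequences if s["key"] not in held_out_keys]
    ood = [s for s in sequences if s["key"] in held_out_keys]
    return train, ood
```

Holding out entire substructure groups, rather than random molecules, is what makes the evaluation out-of-distribution in this sketch: no molecule sharing the held-out substructure appears in the training sequences.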