Abstract: Computational methods have been widely applied to core problems in drug discovery, such as molecular property prediction. In recent years, deep learning, a data-driven computational method, has achieved a number of impressive successes in various domains. In drug discovery, graph neural networks (GNNs) take molecular graph data as input and learn graph-level representations in non-Euclidean space. A large number of well-performing GNNs have been proposed for molecular graph learning. However, the efficient use of molecular data during training has not received enough attention. Curriculum learning (CL) is a training strategy that reorders the training queue according to the calculated difficulty of each sample, yet the effectiveness of CL has not been established for molecular graph learning. In this study, inspired by chemical domain knowledge and task prior information, we propose a novel CL-based training strategy, called CurrMG, to improve the training efficiency of molecular graph learning. Consisting of a difficulty measurer and a training scheduler, CurrMG is designed as a plug-and-play module that is model-independent and easy to use on molecular data. Extensive experiments demonstrate that molecular graph learning models benefit from CurrMG, with noticeable improvement across five GNN models and eight molecular property prediction tasks (overall improvement of 4.08%). We further observed CurrMG’s encouraging potential in resource-constrained molecular property prediction. These results indicate that CurrMG can be used as a reliable and efficient training strategy for molecular graph learning. Availability: The source code is available at https://github.com/gu-yaowen/CurrMG.
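The two-part structure described in the abstract (a difficulty measurer plus a training scheduler) can be illustrated with a minimal, generic curriculum learning sketch. This is not the CurrMG implementation: the difficulty proxy (SMILES string length), the linear pacing function, and the names `difficulty`, `competence`, and `curriculum_batches` are all assumptions made for illustration only.

```python
import math
import random


def difficulty(smiles: str) -> int:
    """Hypothetical difficulty measurer: treat longer SMILES strings as harder."""
    return len(smiles)


def competence(step: int, total_steps: int, c0: float = 0.25) -> float:
    """Assumed linear pacing: fraction of the sorted data available at `step`."""
    return min(1.0, c0 + (1.0 - c0) * step / total_steps)


def curriculum_batches(samples, total_steps, batch_size, seed=0):
    """Training scheduler: draw each batch from the easiest available samples."""
    rng = random.Random(seed)
    ordered = sorted(samples, key=difficulty)  # easy -> hard
    for step in range(total_steps):
        # Grow the pool of eligible samples as training progresses.
        n_avail = max(batch_size, math.ceil(competence(step, total_steps) * len(ordered)))
        pool = ordered[:n_avail]
        yield rng.sample(pool, min(batch_size, len(pool)))


if __name__ == "__main__":
    molecules = ["C", "CC", "CCO", "CCCC", "CC(C)O", "c1ccccc1",
                 "CC(=O)Oc1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]
    for i, batch in enumerate(curriculum_batches(molecules, total_steps=4, batch_size=2)):
        print(i, batch)
```

Because the scheduler only decides which samples each batch may draw from, a sketch like this sits outside the model, which is one way to read the abstract's "plug-and-play" and "model-independent" description.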

Towards Effective and Generalizable Fine-tuning for Pre-trained Molecular Graph Models

ComABAN: refining molecular representation with the graph attention mechanism to accelerate drug discovery

Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules

Fragment-based Pretraining and Finetuning on Molecular Graphs

Pretraining Graph Transformer for Molecular Representation with Fusion of Multimodal Information

Enhancing molecular property prediction with auxiliary learning and task-specific adaptation

Leveraging 2D molecular graph pretraining for improved 3D conformer generation with graph neural networks

An effective self-supervised framework for learning expressive molecular global representations to drug discovery

Learn molecular representations from large-scale unlabeled molecules for drug discovery

On the Scalability of GNNs for Molecular Graphs

A knowledge-guided pre-training framework for improving molecular representation learning

FineMolTex: Towards Fine-grained Molecular Graph-Text Pre-training

Dual-view Molecular Pre-training

Does GNN Pretraining Help Molecular Representation?

MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning

Improving chemical reaction yield prediction using pre-trained graph neural networks

Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators

Describe Molecules by a Heterogeneous Graph Neural Network with Transformer-like Attention for Supervised Property Predictions

An efficient curriculum learning-based strategy for molecular graph learning

Molecular Graph Enhanced Transformer for Retrosynthesis Prediction