Enhancing Molecular Property Prediction via Mixture of Collaborative Experts

Xu Yao, Shuang Liang, Songqiao Han, Hailiang Huang

2023-12-06

Abstract: The Molecular Property Prediction (MPP) task involves predicting biochemical properties based on molecular features, such as molecular graph structures, contributing to the discovery of lead compounds in drug development. To address data scarcity and imbalance in MPP, some studies have adopted Graph Neural Networks (GNN) as an encoder to extract commonalities from molecular graphs. However, these approaches often use a separate predictor for each task, neglecting the shared characteristics among predictors corresponding to different tasks. In response to this limitation, we introduce the GNN-MoCE architecture. It employs the Mixture of Collaborative Experts (MoCE) as predictors, exploiting task commonalities while confronting the homogeneity issue in the expert pool and the decision dominance dilemma within the expert group. To enhance expert diversity for collaboration among all experts, the Expert-Specific Projection method is proposed to assign a unique projection perspective to each expert. To balance decision-making influence for collaboration within the expert group, the Expert-Specific Loss is presented to integrate individual expert loss into the weighted decision loss of the group for more equitable training. Benefiting from the enhancements of MoCE in expert creation, dynamic expert group formation, and experts' collaboration, our model demonstrates superior performance over traditional methods on 24 MPP datasets, especially in tasks with limited data or high imbalance.
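The idea of giving each expert its own projection and then forming a small expert group per input can be illustrated with a toy sketch. This is not the authors' implementation: the class name `ToyExpertPool`, the linear heads, the random initialization, and the top-k softmax routing are all assumptions made for illustration, and the molecule embedding stands in for the output of a GNN encoder.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix used as a stand-in for learned weights."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyExpertPool:
    """Hypothetical expert pool: every expert owns a distinct projection
    matrix (its own 'view' of the embedding) plus a linear prediction head."""

    def __init__(self, n_experts, in_dim, proj_dim, seed=0):
        rng = random.Random(seed)
        self.projections = [random_matrix(proj_dim, in_dim, rng)
                            for _ in range(n_experts)]
        self.heads = [[rng.gauss(0.0, 0.5) for _ in range(proj_dim)]
                      for _ in range(n_experts)]
        self.gate = random_matrix(n_experts, in_dim, rng)

    def expert_logit(self, idx, embedding):
        # Project the shared embedding through this expert's own matrix,
        # then score the projected view with the expert's head.
        view = matvec(self.projections[idx], embedding)
        return sum(w * v for w, v in zip(self.heads[idx], view))

    def predict(self, embedding, top_k=2):
        # Assumed routing: keep the top_k experts by gate score and
        # renormalize their scores into decision weights.
        gate_scores = matvec(self.gate, embedding)
        chosen = sorted(range(len(gate_scores)),
                        key=lambda i: gate_scores[i], reverse=True)[:top_k]
        weights = softmax([gate_scores[i] for i in chosen])
        logits = [self.expert_logit(i, embedding) for i in chosen]
        group_logit = sum(w * z for w, z in zip(weights, logits))
        prob = 1.0 / (1.0 + math.exp(-group_logit))
        return prob, chosen, weights


pool = ToyExpertPool(n_experts=4, in_dim=3, proj_dim=2)
prob, chosen, weights = pool.predict([0.2, -0.1, 0.7], top_k=2)
print(prob, chosen, weights)
```

Because every expert sees the embedding through a different matrix, two experts with identical heads would still produce different predictions, which is one simple way to picture how a per-expert projection could counteract homogeneity in the pool.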

Machine Learning, Multiagent Systems, Quantitative Methods

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to improve the performance of molecular property prediction (MPP) tasks by proposing a new architecture, GNN-MoCE (Graph Neural Network-based Mixture of Collaborative Experts). Specifically, the study addresses the following issues:

1. **Data Scarcity and Imbalance**:
   - In the drug development process, molecular property prediction suffers from insufficient data and class imbalance. Traditional methods use Graph Neural Networks (GNN) as encoders to extract molecular features, but they typically set up a separate predictor for each task, ignoring the characteristics shared across tasks.
2. **Enhancing Expert Diversity**:
   - Traditional Mixture of Experts (MoE) structures have a homogeneity problem within the expert pool, making it difficult for experts to collaborate effectively in decision-making. This paper proposes an "expert-specific projection" method, which gives each expert a unique perspective, thereby enhancing diversity among experts.
3. **Mitigating Decision Dominance Dilemma**:
   - In traditional MoE structures, one expert's weight tends to become too dominant in the dynamic decision group, suppressing the learning of the other experts. To address this, the paper introduces an "expert-specific loss" so that each expert is trained fairly within the decision group, improving overall decision effectiveness.

With these methods, the GNN-MoCE architecture outperforms traditional methods on 24 MPP datasets, particularly in tasks with limited data or extremely imbalanced classes.
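The decision dominance issue and the expert-specific loss described above can be made concrete with a small numeric sketch. The exact formulation is not given in this summary, so the following is only one plausible reading: the mixing coefficient `alpha`, the uniform average over experts, and the binary cross-entropy objective are all assumptions, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy for a single binary label, clipped for stability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def group_loss_with_expert_terms(expert_logits, gate_weights, label, alpha=0.5):
    """Combine the loss of the weighted group decision with the losses of
    the individual experts in the group.

    alpha is a hypothetical mixing coefficient; alpha=0 recovers the plain
    weighted-decision loss of a standard mixture of experts.
    """
    # Loss of the group decision: experts are mixed by their gate weights.
    group_logit = sum(w * z for w, z in zip(gate_weights, expert_logits))
    group_loss = binary_cross_entropy(sigmoid(group_logit), label)

    # Loss of each expert on its own, ignoring the gate weights, so a
    # down-weighted expert still receives a training signal.
    expert_losses = [binary_cross_entropy(sigmoid(z), label)
                     for z in expert_logits]
    expert_term = sum(expert_losses) / len(expert_losses)

    return group_loss + alpha * expert_term, group_loss, expert_losses


# A dominated group: the first expert holds almost all of the gate weight
# and is correct, while the second expert is wrong but barely counted.
total, group_only, per_expert = group_loss_with_expert_terms(
    expert_logits=[3.0, -2.0], gate_weights=[0.98, 0.02], label=1)
print(total, group_only, per_expert)
```

In this example the group loss alone is small because the dominant expert is correct, so the second expert would receive almost no gradient. Adding the per-expert term makes the second expert's error visible in the objective, which is the intuition behind training every group member more equitably.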
