Predicting cognitive scores with graph neural networks through sample selection learning

Martin Hanik, Mehmet Arif Demirtaş, Mohammed Amine Gharsallaoui, Islem Rekik
DOI: https://doi.org/10.1007/s11682-021-00585-7
2022-02-15
Abstract: Analyzing the relation between intelligence and neural activity is of the utmost importance in understanding the working principles of the human brain in health and disease. In existing literature, functional brain connectomes have been used successfully to predict cognitive measures such as intelligence quotient (IQ) scores in both healthy and disordered cohorts using machine learning models. However, existing methods resort to flattening the brain connectome (i.e., graph) through vectorization which overlooks its topological properties. To address this limitation and inspired by the emerging graph neural networks (GNNs), we design a novel regression GNN model (namely RegGNN) for predicting IQ scores from brain connectivity. On top of that, we introduce a novel, fully modular sample selection method to select the best samples to learn from for our target prediction task. However, since such deep learning architectures are computationally expensive to train, we further propose a *learning-based sample selection* method that learns how to choose the training samples with the highest expected predictive power on unseen samples. For this, we capitalize on the fact that connectomes (i.e., their adjacency matrices) lie in the symmetric positive definite (SPD) matrix cone. Our results on full-scale and verbal IQ prediction outperform comparison methods in autism spectrum disorder cohorts and achieve a competitive performance for neurotypical subjects using 3-fold cross-validation. Furthermore, we show that our sample selection approach generalizes to other learning-based methods, which shows its usefulness beyond our GNN architecture.
Machine Learning, Neurons and Cognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **How to accurately predict cognitive scores (such as IQ scores) through functional brain connectomes and overcome the limitations of ignoring the topological properties of brain connectomes in existing methods**.

Specifically, existing methods usually vectorize brain connectomes (that is, flatten the graph-structured connectomes into vectors), which leads to the loss of their topological properties. To overcome this problem, the authors introduce a novel regression model based on Graph Neural Networks (called RegGNN), which can directly process the graph structure of brain connectomes, thereby retaining their topological properties.

In addition, the authors also propose a new sample selection method, aiming to select the most predictive training samples to improve the prediction performance of the model and reduce the computational resources required for training. This method utilizes the geometric properties of the weighted adjacency matrix of functional brain connectomes in the Symmetric Positive Definite (SPD) cone, and measures the similarity between different connectomes by means of Riemannian geometry.

### Main contributions:

1. **Propose a new learning-based sample selection method** to improve the accuracy when predicting cognitive scores from connectomes.
2. **Introduce new brain connectome similarity metrics** that combine the concepts of Riemannian geometry and graph topology, and these metrics can be applied to other application scenarios that need to handle elements of Riemannian manifolds.
3. **Design a pipeline including RegGNN and sample selection** that outperforms existing models in predicting full-scale intelligence and verbal IQ in the Autism Spectrum Disorder (ASD) cohort and achieves competitive performance in the neurotypical cohort.
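
To make the idea of comparing connectomes with Riemannian geometry concrete, here is a minimal sketch of one distance on the SPD cone. It uses the log-Euclidean distance, which is a common Riemannian metric on SPD matrices but is an assumption here: this summary does not spell out the exact similarity metrics used in the paper. The helper names (`spd_log`, `log_euclidean_distance`, `random_spd`) and the random matrices standing in for connectomes are hypothetical.

```python
import numpy as np


def spd_log(m):
    """Matrix logarithm of a symmetric positive definite matrix.

    Uses the eigendecomposition m = V diag(w) V^T, so log(m) = V diag(log w) V^T.
    """
    eigvals, eigvecs = np.linalg.eigh(m)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_distance(a, b):
    """Frobenius norm of the difference between the two matrix logarithms."""
    return float(np.linalg.norm(spd_log(a) - spd_log(b), ord="fro"))


def random_spd(n, rng):
    """Hypothetical stand-in for a connectome: a random SPD matrix."""
    x = rng.standard_normal((n, 2 * n))
    # A small ridge keeps every eigenvalue strictly positive.
    return x @ x.T / (2 * n) + 1e-3 * np.eye(n)


rng = np.random.default_rng(0)
a = random_spd(5, rng)
b = random_spd(5, rng)

d_ab = log_euclidean_distance(a, b)  # distance between two different matrices
d_ba = log_euclidean_distance(b, a)  # same value, since the distance is symmetric
d_aa = log_euclidean_distance(a, a)  # zero for identical matrices
```

Real functional connectomes built from correlations may be only positive semi-definite, so a small regularization term like the ridge in `random_spd` is usually needed before taking the logarithm.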
### Method overview:

- **RegGNN architecture**: It includes two graph convolutional layers and one fully-connected layer and can directly process the graph structure of brain connectomes.
- **Sample selection method**: Calculate the Riemannian tangent space matrices between connectomes, extract topological features, and use a linear regression model to select the most representative training samples.
- **Dataset**: Use the ABIDE pre-processed dataset, which covers the functional magnetic resonance imaging data of neurotypical individuals and individuals with Autism Spectrum Disorder.

### Evaluation and comparison:

- Use 3-fold cross-validation to evaluate the generalization ability and robustness of the model.
- Report the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) as evaluation metrics.
- Compare with the existing state-of-the-art deep learning and machine learning methods (such as CPM and PNA).

In conclusion, this paper aims to improve the accuracy of predicting cognitive scores from brain connectomes by improving the model architecture and sample selection strategy, while exploring the importance of the topological properties of brain connectomes in prediction tasks.
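
The architecture and metrics described above can be sketched in plain NumPy. This is an illustrative forward pass only, not the authors' implementation. The following are assumptions made for the example: the symmetric normalization with self-loops, the ReLU activation, identity node features, the hidden size, the random weights, non-negative edge weights, and the toy score values. All function names are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2; assumes non-negative edge weights."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def regression_gnn_forward(adj, feats, w1, w2, w_fc, b_fc):
    """Two graph convolutions followed by one fully connected layer."""
    a_norm = normalize_adjacency(adj)
    h = np.maximum(a_norm @ feats @ w1, 0.0)  # graph conv 1 + ReLU, shape (n, hidden)
    h = a_norm @ h @ w2                       # graph conv 2, shape (n, 1)
    return float(h[:, 0] @ w_fc + b_fc)       # fully connected layer over the nodes


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


rng = np.random.default_rng(0)
n_rois, hidden = 6, 4  # hypothetical sizes, far smaller than a real brain atlas

# Toy symmetric, non-negative connectome with an empty diagonal.
raw = rng.random((n_rois, n_rois))
adj = (raw + raw.T) / 2
np.fill_diagonal(adj, 0.0)

feats = np.eye(n_rois)  # assumption: identity node features
w1 = rng.standard_normal((n_rois, hidden)) * 0.1
w2 = rng.standard_normal((hidden, 1)) * 0.1
w_fc = rng.standard_normal(n_rois) * 0.1

# Untrained prediction for one subject; the bias is an arbitrary starting value.
pred = regression_gnn_forward(adj, feats, w1, w2, w_fc, b_fc=100.0)

# Toy scores, used only to show how the two metrics are computed.
y_true = [100.0, 110.0, 90.0]
y_pred = [104.0, 107.0, 90.0]
mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
```

In a 3-fold cross-validation setting, the two metrics would be computed on each held-out fold and then summarized across folds. Functional connectomes often contain negative correlations, so the degree normalization used here would need adapting before being applied to real data.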