Abstract: Using machine-learning tools to predict individual phenotypes from neuroimaging data is one of the most promising and hence dynamic fields in systems neuroscience. Here, we perform a literature survey of the rapidly growing body of work on phenotype prediction in healthy subjects or the general population to sketch out the current state and ongoing developments in terms of data, analysis methods and reporting. Excluding papers on age prediction and clinical applications, which form a distinct literature, we identified a total of 108 papers published since 2007. In these, memory, fluid intelligence and attention were the most common phenotypes to be predicted, which resonates with the observation that roughly a quarter of the papers used data from the Human Connectome Project, even though another half recruited their own cohort. Sample size (in terms of training and external test sets) and prediction accuracy (from internal and external validation, respectively) did not show significant temporal trends. Prediction accuracy was negatively correlated with the sample size of the training set, but not with that of the external test set. While known to be optimistic, leave-one-out cross-validation (LOO CV) was the prevalent strategy for model validation (n = 48). Meanwhile, 27 studies used external validation with an external test set. Neither number showed a significant temporal trend. The most popular learning algorithm was connectome-based predictive modeling, introduced by the Yale team. Other common learning algorithms were linear regression, relevance vector regression (RVR), support vector regression (SVR), least absolute shrinkage and selection operator (LASSO), and elastic net. Meanwhile, the amount of data from self-recruiting studies (but not from studies using open, shared datasets) was positively correlated with internal validation prediction accuracy. At the same time, self-recruiting studies also reported a significantly higher internal validation prediction accuracy than those using open, shared datasets. Data type and participant age did not significantly influence prediction accuracy. Confound control also did not influence prediction accuracy after adjusting for other factors. To conclude, most of the current literature probably reports optimistic accuracy estimates, given its reliance on internal validation with LOO CV. More efforts should be made to encourage the use of external validation with external test sets to further improve the generalizability of the models.
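The two validation strategies contrasted in the abstract can be illustrated with a minimal sketch. It is not the pipeline of any surveyed study: the data are simulated, the model is a one-feature least-squares line, and all names (`simulate_cohort`, `fit_line`, `loo_cv_predictions`, the cohort sizes and the effect size) are assumptions made for illustration. Accuracy is expressed here as the Pearson correlation between predicted and observed scores, which is itself an assumption about the metric.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5


def loo_cv_predictions(x, y):
    """Internal validation: predict each subject from a model fit on all others."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return preds


def simulate_cohort(n, rng, slope=0.5, noise=1.0):
    """Hypothetical cohort: one brain feature and one noisy phenotype score."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, noise) for xi in x]
    return x, y


rng = random.Random(0)
x_train, y_train = simulate_cohort(60, rng)  # assumed training cohort
x_ext, y_ext = simulate_cohort(40, rng)      # assumed independent external cohort

# Internal validation: LOO CV within the training cohort.
internal_r = pearson_r(loo_cv_predictions(x_train, y_train), y_train)

# External validation: fit once on the training cohort, test on the external one.
slope, intercept = fit_line(x_train, y_train)
external_r = pearson_r([slope * xi + intercept for xi in x_ext], y_ext)

print(f"internal (LOO CV) r = {internal_r:.2f}, external r = {external_r:.2f}")
```

Because both simulated cohorts are drawn from the same distribution, this sketch only shows the mechanics of the two procedures. It does not reproduce the optimism of LOO CV, which in practice arises from factors such as site, scanner and sample differences that a single simulated population does not capture.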

Reporting details of neuroimaging studies on individual traits prediction: A literature survey
