Abstract:

Introduction: Machine Learning (ML) has emerged as a promising approach in healthcare, outperforming traditional statistical techniques. However, to establish ML as a reliable tool in clinical practice, adherence to best practices in data handling, model design, and model assessment is crucial. In this work, we summarize and strictly adhere to such practices to ensure reproducible and reliable ML. Specifically, we focus on Alzheimer's Disease (AD) detection, a challenging problem in healthcare. Additionally, we investigate the impact of modeling choices, including different data augmentation techniques and model complexity, on overall performance.

Methods: We utilize Magnetic Resonance Imaging (MRI) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) corpus to address a binary classification problem using 3D Convolutional Neural Networks (CNNs). Data processing and modeling are specifically tailored to address data scarcity and minimize computational overhead. Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures with varying numbers of convolutional layers. The augmentation strategies involve affine transformations, such as zoom, shift, and rotation, applied either concurrently or separately.

Results: The combined effect of data augmentation and model complexity results in up to 10% variation in prediction accuracy. Notably, when the affine transformations are applied separately, the model achieves higher accuracy, regardless of the chosen architecture. Across all strategies, model accuracy is a concave function of the number of convolutional layers, peaking at an intermediate value. The best model reaches excellent performance on both the internal test set and an additional external test set.

Discussion: Our work underscores the critical importance of adhering to rigorous experimental practices in the field of ML applied to healthcare. The results clearly demonstrate how data augmentation and model depth, two often overlooked factors, can dramatically impact final performance if not thoroughly investigated. This highlights both the necessity of exploring neglected modeling aspects and the need to comprehensively report all modeling choices to ensure reproducibility and facilitate meaningful comparisons across studies.
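
The difference between applying the affine transformations concurrently and separately can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameter ranges, the restriction to rotation about a single axis, and the reading of "separately" as "one transformation per augmented copy" are all assumptions made for illustration. Each transformation is written as a 4x4 homogeneous matrix acting on 3D voxel coordinates.

```python
import math
import random
from typing import Dict, List

Matrix = List[List[float]]


def identity() -> Matrix:
    """4x4 identity matrix in homogeneous coordinates."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product a @ b of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def zoom(factor: float) -> Matrix:
    """Isotropic scaling by `factor`."""
    m = identity()
    for i in range(3):
        m[i][i] = float(factor)
    return m


def shift(dx: float, dy: float, dz: float) -> Matrix:
    """Translation by (dx, dy, dz) voxels."""
    m = identity()
    m[0][3], m[1][3], m[2][3] = float(dx), float(dy), float(dz)
    return m


def rotation_z(angle_deg: float) -> Matrix:
    """Rotation about the third axis (single axis chosen for brevity)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    m = identity()
    m[0][0], m[0][1], m[1][0], m[1][1] = c, -s, s, c
    return m


def sample_transforms(rng: random.Random) -> Dict[str, Matrix]:
    """Draw one random zoom, shift, and rotation.

    The ranges below are hypothetical placeholders, not values from the paper.
    """
    return {
        "zoom": zoom(rng.uniform(0.9, 1.1)),
        "shift": shift(*(rng.uniform(-2.0, 2.0) for _ in range(3))),
        "rotation": rotation_z(rng.uniform(-5.0, 5.0)),
    }


def augmentation_matrices(mode: str, rng: random.Random) -> List[Matrix]:
    """Return the affine matrices used to augment one input volume.

    "separate":   three matrices, each applying a single transformation.
    "concurrent": one matrix that zooms, then rotates, then shifts.
    """
    t = sample_transforms(rng)
    if mode == "separate":
        return [t["zoom"], t["shift"], t["rotation"]]
    if mode == "concurrent":
        # For column vectors, shift @ rotation @ zoom applies zoom first.
        return [matmul(t["shift"], matmul(t["rotation"], t["zoom"]))]
    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, the "separate" mode yields three augmented copies per volume and the "concurrent" mode yields one. Each matrix could then be passed to a resampling routine such as `scipy.ndimage.affine_transform`, which interprets the matrix as a mapping from output to input coordinates, so the inverse transform may be needed depending on the intended direction.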

Deep learning-based Alzheimer's disease detection: reproducibility and the effect of modeling choices
