Development of a differential treatment selection model for depression on consolidated and transformed clinical trial datasets

Kelly Perlman, Joseph Mehltretter, David Benrimoh, Caitrin Armstrong, Robert Fratila, Christina Popescu, Jingla-Fri Tunteng, Jerome Williams, Colleen Rollins, Grace Golden, Gustavo Turecki
DOI: https://doi.org/10.1038/s41398-024-02970-4
2024-06-22
Translational Psychiatry
Abstract: Major depressive disorder (MDD) is the leading cause of disability worldwide, yet treatment selection still proceeds via "trial and error". Given the varied presentation of MDD and heterogeneity of treatment response, the use of machine learning to understand complex, non-linear relationships in data may be key for treatment personalization. Well-organized, structured data from clinical trials with standardized outcome measures is useful for training machine learning models; however, combining data across trials poses numerous challenges. There is also persistent concern that machine learning models can propagate harmful biases. We have created a methodology for organizing and preprocessing depression clinical trial data such that transformed variables harmonized across disparate datasets can be used as input for feature selection. Using Bayesian optimization, we identified an optimal multi-layer dense neural network that used data from 21 clinical and sociodemographic features as input in order to perform differential treatment benefit prediction. With this combined dataset of 5032 individuals and 6 drugs, we created a differential treatment benefit prediction model. Our model generalized well to the held-out test set and produced similar accuracy metrics in the test and validation set with an AUC of 0.7 when predicting binary remission. To address the potential for bias propagation, we used a bias testing performance metric to evaluate the model for harmful biases related to ethnicity, age, or sex. We present a full pipeline from data preprocessing to model validation that was employed to create the first differential treatment benefit prediction model for MDD containing 6 treatment options.
psychiatry
What problem does this paper attempt to address?
The paper addresses treatment personalization in depression: it develops a differential treatment selection model that predicts how individual patients will respond to different antidepressants. Its main objectives are:

1. **Replacing "trial and error" treatment selection**: Depression treatment is currently chosen largely by "trial and error", which is inefficient and may lead to worse outcomes. The paper proposes using machine learning to capture complex, non-linear relationships in the data and so personalize treatment.
2. **Dataset integration and preprocessing**: Depression clinical trial datasets are heterogeneous, which makes combining them difficult. The authors created a methodology for organizing and preprocessing these datasets so that variables are harmonized across trials and can be used as input for feature selection.
3. **Developing a differential treatment benefit prediction model**: Bayesian optimization was used to identify an optimal multi-layer dense neural network. The model takes 21 clinical and sociodemographic features as input and predicts differential treatment benefit across 6 drugs, using a combined dataset of 5032 individuals.
4. **Testing for model bias**: Because machine learning models can propagate harmful biases, the authors used a bias testing performance metric to evaluate the model for harmful biases related to ethnicity, age, or sex.

In summary, the research integrates data from multiple clinical trials and uses machine learning (specifically a dense neural network) to build a model that predicts patient responses to different antidepressants, with the aim of more personalized treatment selection. The study also evaluates the model for potential harmful biases.
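To make the ideas of differential treatment benefit prediction and subgroup bias testing more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' pipeline: the treatment labels, the `toy_model` scoring function, the `baseline_severity` feature, and the helper names `rank_treatments`, `auc`, and `auc_by_subgroup` are all hypothetical, and the per-subgroup AUC comparison is only one plausible way to look for bias; the abstract does not specify which bias testing metric was used.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical placeholder labels; the source states only that there are 6 drugs.
TREATMENTS = ["drug_a", "drug_b", "drug_c", "drug_d", "drug_e", "drug_f"]


def rank_treatments(
    features: Dict[str, float],
    predict_remission: Callable[[Dict[str, float], str], float],
) -> List[Tuple[str, float]]:
    """Score one patient under every treatment and sort best-first."""
    scores = {t: predict_remission(features, t) for t in TREATMENTS}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the chance a remitter outscores a non-remitter (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(
    labels: Sequence[int], scores: Sequence[float], groups: Sequence[str]
) -> Dict[str, float]:
    """Compute AUC separately within each subgroup (e.g. sex or age band)."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, v in enumerate(groups) if v == g]
        result[g] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


def toy_model(features: Dict[str, float], treatment: str) -> float:
    """Stand-in for a trained network: a made-up remission score per treatment."""
    base = 0.3 + 0.05 * TREATMENTS.index(treatment)
    return min(1.0, base + 0.01 * features.get("baseline_severity", 0.0))


if __name__ == "__main__":
    # Rank the 6 hypothetical treatments for one hypothetical patient.
    for name, score in rank_treatments({"baseline_severity": 10.0}, toy_model):
        print(f"{name}: {score:.2f}")
    # Compare discrimination across two made-up subgroups.
    print(auc_by_subgroup([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6], ["f", "f", "m", "m"]))
```

In a real setting, `predict_remission` would wrap the trained model, and a large gap between subgroup AUCs would be a prompt for closer inspection rather than proof of bias on its own.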