Artificial Intelligence-Based Molecular Property Prediction of Photosensitizing Effects of Drugs

Amun Georg Hofmann, Asan Agibetov
DOI: https://doi.org/10.26434/chemrxiv-2023-1zzc8
2023-12-07
Abstract:
Introduction: Drug-induced photosensitivity is an adverse event of various agents that are used in all major specialties of clinical medicine. Apart from the acute condition, an association between photosensitive events and an increased risk of skin cancer has been repeatedly reported. However, photosensitizing properties of drugs and chemical compounds are also deliberately utilized as a treatment modality, for example as photodynamic therapy in oncology. While certain chemical features have been shown to induce photosensitivity more frequently, the matter is still not conclusively understood, and commonly used photobiological assays are considered to be affected by several limitations. In the present work, we investigated the feasibility of predicting photosensitizing effects of drugs and chemical compounds via state-of-the-art artificial intelligence-based workflows.
Methods: A dataset of 2,200 drugs was used to train three distinct models (logistic regression, XGBoost, and a deep learning model) to predict photosensitizing attributes based on the SMILES string. Labels were obtained from a list of previously published photosensitizers, resulting in 205 photosensitizing drugs. Data was partitioned using an 80/10/10 training-validation-test split by molecular scaffold. External evaluation of the different models was performed using the tox21 dataset and included a technical interpretation of prediction scores as well as a pharmacological interpretation.
Results: ROC-AUC ranged between 0.8939 (deep learning model) and 0.9525 (XGBoost) during training, while in the test partition it ranged between 0.7785 (deep learning) and 0.7927 (XGBoost). The models were then used to generate predictions on the external validation set. Analysis of the top 200 compounds of each model resulted in 55 overlapping molecules. Fifteen of those were fluoroquinolones, a class of commonly reported photosensitizers. Prediction scores in this subset corresponded well with culprit substructures suspected of mediating photosensitizing effects.
Discussion: All three models appeared capable of predicting photosensitizing effects of chemical compounds. However, compared to the simpler model (logistic regression), the complex models (XGBoost and Chemprop) appeared to be more confident in their predictions, as exhibited by their distribution of prediction scores. The evaluation of the models on external data further solidified the feasibility of molecular property prediction for photosensitizing abilities. A qualitative analysis of fluoroquinolones in the external dataset based on available photobiological evidence showed that their prediction scores corresponded well with their chemical structure.
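The 80/10/10 split by molecular scaffold mentioned in the Methods could look roughly like the sketch below. It is an illustration only, not the authors' code: the function name `scaffold_split`, the greedy largest-group-first assignment, and the toy scaffold keys are all assumptions. It also assumes that one scaffold key per molecule (for example a Bemis-Murcko scaffold SMILES) has already been computed elsewhere.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Split molecule indices so that no scaffold appears in two partitions.

    `scaffolds` holds one hashable scaffold key per molecule. Computing the
    keys from SMILES strings is outside the scope of this sketch.
    """
    # Group molecule indices by their scaffold key.
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)

    # Assign the largest groups first; break ties by first index so the
    # result is deterministic.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cap = frac_train * n
    valid_cap = frac_valid * n
    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= train_cap:
            train.extend(group)
        elif len(valid) + len(group) <= valid_cap:
            valid.extend(group)
        else:
            # Whatever fits in neither train nor validation goes to test.
            test.extend(group)
    return train, valid, test


# Hypothetical scaffold keys for ten molecules (placeholders, not real data).
toy_scaffolds = ["A"] * 5 + ["B"] * 3 + ["C"] + ["D"]
train_idx, valid_idx, test_idx = scaffold_split(toy_scaffolds)
print(len(train_idx), len(valid_idx), len(test_idx))
```

Because whole scaffold groups are assigned at once, the resulting partition sizes only approximate 80/10/10 on real data. The benefit is that test molecules are structurally distinct from training molecules, which makes the test ROC-AUC a stricter estimate of generalization than a random split would give.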
Chemistry
What problem does this paper attempt to address?
The paper attempts to address the problem of predicting the photosensitizing effects of drugs and chemical compounds. Specifically, the researchers use artificial intelligence techniques, particularly machine learning and deep learning methods, to predict whether a drug has photosensitizing properties. This is clinically important because drug-induced photosensitivity reactions cause acute skin problems and have also been associated with an increased risk of skin cancer. Additionally, photosensitizing properties are used deliberately in treatments such as photodynamic therapy.

### Main Research Background:

1. **Drug-induced Photosensitivity**: Many drugs cause adverse skin reactions under light exposure, including phototoxicity and photoallergy. These reactions affect patients' quality of life and have been associated with an increased risk of skin cancer.
2. **Utilization of Photosensitizing Properties**: Photosensitizing properties are also used in treatments, such as photodynamic therapy in oncology, ophthalmology, and dermatology.
3. **Limitations of Existing Methods**: Currently used photobiological testing methods have limitations, such as issues with sensitivity and specificity.

### Research Objectives:

- **Predictive Capability**: To predict the photosensitizing effects of drugs and chemical compounds by building machine learning and deep learning models.
- **Model Validation**: To assess the generalization ability and predictive accuracy of the models on an external dataset (tox21).
- **Mechanism Analysis**: To explore the plausibility of model predictions by analyzing the structural characteristics of specific drugs (e.g., fluoroquinolones).

### Method Overview:

- **Dataset**: A dataset of 2,200 drugs was used for training, of which 205 are known to have photosensitizing properties.
- **Models**: Three models were constructed: logistic regression, XGBoost, and a deep learning model (Chemprop).
- **Evaluation**: Model performance was evaluated using the ROC-AUC metric on an 80/10/10 training-validation-test split by molecular scaffold, followed by external evaluation on the tox21 dataset.

### Results and Discussion:

- **Model Performance**: XGBoost achieved the highest ROC-AUC on both the training set (0.9525) and the test set (0.7927). The gap between the two values indicates a potential overfitting issue.
- **External Validation**: On the external tox21 dataset, the top 200 compounds of each model shared 55 molecules, 15 of which were fluoroquinolones, a class of commonly reported photosensitizers.
- **Mechanism Analysis**: Analysis of fluoroquinolone drugs showed that prediction scores corresponded well with substructures suspected of mediating photosensitizing effects.

### Conclusion:

- **Feasibility**: The study demonstrates that using artificial intelligence methods to predict the photosensitizing effects of drugs is feasible.
- **Application Prospects**: These models could assist in drug development, help optimize the design of photosensitizers, reduce adverse reactions, and improve therapeutic outcomes.

In summary, this study aims to improve the prediction of drug photosensitizing effects through artificial intelligence techniques, so that this property can be better managed and utilized in clinical applications.
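As an illustration of the two evaluation steps reported in the paper (ROC-AUC scoring and the overlap of each model's top-ranked compounds), here is a minimal pure-Python sketch. The function names `roc_auc` and `top_k_overlap`, the pairwise AUC formulation, and all toy labels, scores, and compound IDs are assumptions made for this example. The paper does not state how these quantities were computed.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_overlap(score_tables, k):
    """Return the compound IDs that rank in the top k of every model.

    `score_tables` is a list of dicts, one per model, mapping a compound ID
    to that model's prediction score.
    """
    top_sets = []
    for scores in score_tables:
        ranked = sorted(scores, key=scores.get, reverse=True)
        top_sets.append(set(ranked[:k]))
    return set.intersection(*top_sets)


# Toy labels and scores (placeholders, not data from the paper).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(toy_labels, toy_scores))

# Hypothetical prediction scores from three models on three compounds.
model_scores = [
    {"cmpd_a": 0.9, "cmpd_b": 0.8, "cmpd_c": 0.1},
    {"cmpd_a": 0.7, "cmpd_b": 0.2, "cmpd_c": 0.6},
    {"cmpd_a": 0.5, "cmpd_b": 0.4, "cmpd_c": 0.3},
]
print(top_k_overlap(model_scores, k=2))
```

The pairwise loop is quadratic in the number of compounds, which is acceptable for a dataset of a few thousand molecules. For larger sets, a rank-based implementation would be the usual choice. With `k=200` and one score table per model, `top_k_overlap` corresponds to the kind of analysis that yielded the 55 shared molecules reported in the abstract.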