Machine Learning Assisted Web Application for Identifying Beneficial Drug Candidates for Genetic Alterations in Cancer Patients

Hakan Bozcuk
DOI: https://doi.org/10.1101/2024.06.03.24308392
2024-06-04
Abstract:
Background: Precision medicine in oncology relies heavily on molecular genetic data, primarily obtained from Next-generation sequencing (NGS) tests. However, the complexity of these data and the need to match genetic alterations with specific drug candidates pose significant challenges for clinicians. To simplify this process, a user-friendly web application has been developed. This app facilitates the matching, graphical presentation, and clustering of treatment options for specific genetic alterations, making it easier for clinicians to interpret and apply the results in patient care.
Materials and Methods: Utilizing the application programming interface (API) of the Drug-Gene Interaction Database (DGIdb 4.0), a web application was developed in Python to list drugs that interact with specific genetic changes. The application features a user-friendly display achieved through graphical representation and web scraping for gene-related information. Additionally, unsupervised machine learning, specifically K-means cluster analysis, was employed to categorize drug candidates based on their interaction scores with the genetic alteration in question. To enhance the interpretability of the results, the web app also provides key references and web links to the relevant drug interactions.
Results: The developed web application successfully filtered, listed, and displayed the gene interaction results. Utilizing an unsupervised machine learning algorithm, the app identified three optimal clusters of drug candidates based on their efficacy potentials using the Elbow Method. The cluster analysis demonstrated strong performance, evidenced by the following metrics for BRAF mutations: a Silhouette score of 0.74, a Davies-Bouldin index of 0.44, and a Calinski-Harabasz index of 475.96. Additionally, the web app effectively extracted and defined relevant gene information and identified key references for each genetic alteration within the cloud database.
Conclusion: The web application developed in this study provides a user-friendly platform for classifying and interpreting drug candidates based on the presence of specific genetic alterations in cancer patients. This tool is expected to enhance the accessibility and usability of genetic data, aiding clinicians in making informed treatment decisions.
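The three cluster-validity metrics quoted in the Results (Silhouette score, Davies-Bouldin index, Calinski-Harabasz index) can be illustrated with a small standard-library sketch for one-dimensional scores. This is not the paper's code: the abstract does not say which library was used, the function names are our own, and the scores are hypothetical toy values rather than DGIdb data. In practice, scikit-learn's `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score` in `sklearn.metrics` compute the same quantities.

```python
from statistics import mean


def group(scores, labels):
    """Collect the scores belonging to each cluster label."""
    clusters = {}
    for x, lab in zip(scores, labels):
        clusters.setdefault(lab, []).append(x)
    return clusters


def silhouette(scores, labels):
    """Mean silhouette coefficient (needs >= 2 clusters); closer to 1 is better."""
    clusters = group(scores, labels)
    total = 0.0
    for x, lab in zip(scores, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of x's own cluster
        a = sum(abs(x - y) for y in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(mean(abs(x - y) for y in pts)
                for other, pts in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(scores)


def davies_bouldin(scores, labels):
    """Davies-Bouldin index (needs >= 2 clusters); lower is better."""
    clusters = group(scores, labels)
    centre = {lab: mean(pts) for lab, pts in clusters.items()}
    spread = {lab: mean(abs(x - centre[lab]) for x in pts)
              for lab, pts in clusters.items()}
    # For each cluster, keep its worst (largest) similarity to any other cluster.
    worst = [max((spread[i] + spread[j]) / abs(centre[i] - centre[j])
                 for j in clusters if j != i)
             for i in clusters]
    return mean(worst)


def calinski_harabasz(scores, labels):
    """Between-cluster over within-cluster dispersion; higher is better."""
    clusters = group(scores, labels)
    n, k, overall = len(scores), len(clusters), mean(scores)
    between = sum(len(pts) * (mean(pts) - overall) ** 2
                  for pts in clusters.values())
    within = sum((x - mean(pts)) ** 2
                 for pts in clusters.values() for x in pts)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical interaction scores (not from DGIdb) with a 3-cluster labelling.
scores = [0.5, 0.7, 0.9, 5.0, 5.2, 5.4, 10.0, 10.3, 10.6]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
print(f"Silhouette:        {silhouette(scores, labels):.2f}")
print(f"Davies-Bouldin:    {davies_bouldin(scores, labels):.2f}")
print(f"Calinski-Harabasz: {calinski_harabasz(scores, labels):.2f}")
```

A higher Silhouette score and Calinski-Harabasz index and a lower Davies-Bouldin index all point to compact, well-separated clusters, which is the sense in which the abstract reports strong clustering performance.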
What problem does this paper attempt to address?
This paper addresses how a machine-learning-assisted web application can identify potentially beneficial drug candidates for specific gene alterations in cancer patients. Specifically, it focuses on simplifying the path from molecular genetic data (obtained primarily through next-generation sequencing) to matched drug candidates in precision medicine, so that clinicians can interpret and apply these data more effectively in patient treatment. The paper proposes a user-friendly web application that can:
- List drugs that interact with specific gene alterations;
- Provide gene-related information through graphical representation and web scraping;
- Classify drug candidates by their interaction scores with the gene alteration in question, using unsupervised machine learning (specifically K-means cluster analysis);
- Provide key references and web links for the relevant drug interactions to enhance the interpretability of the results.
In this way, the study aims to improve the accessibility and usability of genetic data, thereby helping clinicians make more informed treatment decisions.
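To make the clustering step concrete, here is a minimal standard-library sketch of K-means on one-dimensional interaction scores, together with the inertia curve that the Elbow Method inspects. It rests on several assumptions: the scores are hypothetical toy values, `kmeans_1d` is our own helper, and the evenly spaced initialization is chosen only to keep the example deterministic. The paper's own implementation is not shown in the abstract; a library such as scikit-learn (`sklearn.cluster.KMeans`, whose fitted `inertia_` attribute gives the same quantity) would normally be used instead.

```python
from statistics import mean


def kmeans_1d(scores, k, iters=100):
    """Lloyd's algorithm on 1-D data; returns (labels, centroids, inertia)."""
    ordered = sorted(scores)
    # Deterministic start: centroids at evenly spaced positions in the sorted
    # data (an illustrative choice; libraries typically use k-means++).
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)]
                 for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each score joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(x - centroids[j]))
                  for x in scores]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [x for x, lab in zip(scores, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = updated
    # Inertia: within-cluster sum of squared distances to the centroid.
    inertia = sum((x - centroids[lab]) ** 2 for x, lab in zip(scores, labels))
    return labels, centroids, inertia


# Hypothetical drug-gene interaction scores (not from DGIdb).
scores = [0.5, 0.7, 0.9, 5.0, 5.2, 5.4, 10.0, 10.3, 10.6]

# Elbow Method: compute inertia for a range of k and look for the point
# where adding another cluster stops reducing it substantially.
for k in range(1, 6):
    _, _, inertia = kmeans_1d(scores, k)
    print(f"k={k}  inertia={inertia:.3f}")

labels, centroids, _ = kmeans_1d(scores, 3)
print("labels:", labels)
print("centroids:", [round(c, 2) for c in centroids])
```

On this toy data the inertia falls steeply up to k=3 and only marginally afterwards, which is the kind of bend the Elbow Method reads as the preferred number of clusters.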