Analyzing Preprocessing Impact on Machine Learning Classifiers for Cryotherapy and Immunotherapy Dataset

De Rosal Ignatius Moses Setiadi, Hussain Md Mehedul Islam, Gustina Alfa Trisnapradika, Wise Herowati
DOI: https://doi.org/10.62411/faith.2024-2
2024-06-01
Abstract: In the clinical treatment of skin diseases and cancer, cryotherapy and immunotherapy offer effective and minimally invasive alternatives. However, the complexity of patient response demands more sophisticated analytical strategies for accurate outcome prediction. This research analyzes how preprocessing affects the performance of various machine learning models in predicting cryotherapy and immunotherapy outcomes. The preprocessing techniques analyzed are advanced feature engineering and two resampling techniques, Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links, as well as their combination. The classifiers tested include Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), XGBoost, and Bidirectional Gated Recurrent Unit (BiGRU). The findings show that preprocessing can significantly improve model performance, especially for XGBoost. Random Forest reaches the same results as XGBoost and also performs better without significant preprocessing. The best results on the Immunotherapy dataset were 0.8889, 0.8889, 0.6000, 0.9037, and 0.8790 for accuracy, recall, specificity, precision, and F1, respectively; the values reported for the Cryotherapy dataset were likewise 0.8889, 0.8889, 0.6000, 0.9037, and 0.8790. This study confirms the potential of customized preprocessing and machine learning models to provide deep insights into treatment dynamics, ultimately improving the quality of diagnosis.
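The two resampling ideas named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the function names (`tomek_links`, `smote`), the toy one-dimensional data, and the neighbor count `k` are assumptions made purely for illustration. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor; SMOTE creates a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbors.

```python
import math
import random


def nearest_neighbor(i, X):
    """Index of the point closest to X[i], excluding X[i] itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def tomek_links(X, y):
    """Pairs (i, j) with i < j that are mutual nearest neighbors
    and carry different class labels."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def smote(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority points by interpolating between
    a random minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(X_min))
        # k nearest minority neighbors of X_min[i], excluding itself
        neighbors = sorted((j for j in range(len(X_min)) if j != i),
                           key=lambda j: math.dist(X_min[i], X_min[j]))[:k]
        j = rng.choice(neighbors)
        gap = rng.random()  # position along the segment between the two points
        synthetic.append([a + gap * (b - a)
                          for a, b in zip(X_min[i], X_min[j])])
    return synthetic


# Hypothetical toy data: class 0 is the majority, class 1 the minority.
X = [[-1.5, 0.0], [0.0, 0.0], [0.9, 0.0], [2.0, 0.0],
     [2.4, 0.0], [4.0, 0.0], [5.0, 0.0]]
y = [0, 0, 0, 0, 1, 1, 1]

links = tomek_links(X, y)  # points 3 and 4 sit on the class boundary
X_min = [x for x, label in zip(X, y) if label == 1]
new_points = smote(X_min, n_new=1, k=2)  # one extra sample balances the classes
print(links, new_points)
```

In a combined scheme, the synthetic points would be appended to the training set and samples taking part in Tomek links would then be removed to clean the class boundary; whether one or both members of each link are dropped is a design choice that the abstract does not specify.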