Survival Prediction and Comparison of the Titanic based on Machine Learning Classifiers

Tony Wayne Wang
DOI: https://doi.org/10.62051/8fcnnp84
2024-08-12
Abstract: This study compares three machine learning algorithms, Logistic Regression (LR), Decision Tree Classifier (DT), and Random Forest Classifier (RF), for predicting the survival outcomes of passengers on the Titanic. The dataset includes variables such as socio-economic status, age, gender, and family relationships, and the data are prepared and analyzed to train and evaluate these models. The objective is to determine how various passenger features affect survival outcomes and to generate survival predictions with machine learning algorithms. The findings demonstrate that the RF model, particularly with 45 or 75 trees, significantly outperforms LR and DT in terms of precision and recall, establishing it as a more robust classifier for this dataset. The research underscores the value of comparing different machine learning models for binary classification tasks and the role of parameter tuning in enhancing model performance. This comparative analysis not only contributes to the ongoing exploration of the Titanic disaster through data science but also highlights key considerations in applying machine learning algorithms to predictive modeling.
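Because the comparison above is stated in terms of precision and recall, a minimal sketch of how these two metrics are computed for a binary survived/did-not-survive label may be useful. The function name `precision_recall` and the toy labels are illustrative assumptions only; they are not taken from the paper, its dataset, or its results.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels, where 1 means 'survived'.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Precision: share of predicted survivors who actually survived.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: share of actual survivors the model identified.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Toy labels, invented for this example (not Titanic data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this sketch, each classifier's predictions on a held-out test set would be passed through the same function, so that LR, DT, and RF variants are compared on identical terms.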