Research on the Influencing Factors of Diabetes Prediction Model among Pima Indian Heritages Female Data

Yaxuan Huang
DOI: https://doi.org/10.54097/rf0dyd88
2024-06-18
Abstract: Diabetes, a chronic disease with a major impact on global health, profoundly affects people's lives. This research used Multiple Linear Regression (MLR) to explore factors contributing to the incidence of diabetes. The data contain multiple variables that may predict diabetes, and the goal was to assess their predictive value for the binary diabetes outcome. Diagnostic plots indicated potential violations of model assumptions: the residuals vs. fitted values plot suggested non-linear relationships and non-constant residual variance, the normal Q-Q plot revealed minor departures from normality, principally at the extremes of the data, and the Scale-Location plot raised further questions about the consistency of variance across the data. Leverage analysis did not identify any significantly influential data points. The MLR model yielded an R-squared value of 0.3033, meaning that approximately 30% of the variability in diabetes outcomes was accounted for by the model. Two variables, Glucose and BMI, were highly significant, with p-values indicating strong associations with diabetes. The study thus identifies important determinants of health and points to the need for better-specified models, as the diagnostics show. For health care, these findings indicate that diabetes risk is complex and that prediction could benefit from more sophisticated analytical models.
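To make the modelling step concrete, the following is a minimal sketch of fitting a linear regression to a binary outcome and computing R-squared. It is not the author's code or data: the data are synthetic and generated inside the script, the helper names `solve_linear_system` and `fit_ols` are hypothetical, and only two predictors (named after the Glucose and BMI variables mentioned above) are used.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares fit with an intercept.

    Returns (coefficients, r_squared, fitted_values); the first
    coefficient is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot, fitted


# Synthetic data (an assumption for illustration, NOT the Pima dataset):
# the probability of a positive outcome rises with glucose and BMI.
rng = random.Random(0)
rows, outcome = [], []
for _ in range(500):
    glucose = rng.gauss(120, 30)
    bmi = rng.gauss(32, 7)
    p = min(max(0.005 * (glucose - 60) + 0.01 * (bmi - 20), 0.0), 1.0)
    rows.append((glucose, bmi))
    outcome.append(1.0 if rng.random() < p else 0.0)

beta, r_squared, fitted = fit_ols(rows, outcome)
residuals = [yi - fi for yi, fi in zip(outcome, fitted)]
out_of_range = sum(1 for f in fitted if f < 0.0 or f > 1.0)

print("intercept, glucose, bmi coefficients:", beta)
print("R-squared:", round(r_squared, 4))
print("fitted values outside [0, 1]:", out_of_range)
```

A linear model places no constraint on its fitted values, so predictions for a 0/1 outcome can fall below 0 or above 1, and the residual variance necessarily depends on the fitted value. This is one plausible reason for the kind of diagnostic pattern the abstract describes, and a common motivation for trying logistic regression instead.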