Machine learning using Stata/Python

Giovanni Cerulli
DOI: https://doi.org/10.1177/1536867x221140944
2022-12-01
Abstract: I present two related commands, r_ml_stata_cv and c_ml_stata_cv, for fitting popular machine learning methods in both regression and classification settings. Using the recent Stata/Python integration platform introduced in Stata 16, these commands tune hyperparameters optimally via K-fold cross-validation with grid search. More specifically, they use the Python scikit-learn application programming interface to carry out both cross-validation and outcome/label prediction.
statistics & probability, social sciences, mathematical methods
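
The tuning procedure named in the abstract, K-fold cross-validation over a hyperparameter grid followed by prediction, can be sketched directly in scikit-learn. This is a minimal illustration under stated assumptions, not the code behind r_ml_stata_cv or c_ml_stata_cv: the synthetic data, the choice of a regression tree as the learner, the max_depth grid, and K = 5 are all hypothetical choices made only for the example.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeRegressor

# Synthetic data standing in for a dataset passed from Stata (illustrative only).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hypothetical grid over a single hyperparameter of a regression tree.
param_grid = {"max_depth": [2, 4, 6, 8]}

# K-fold cross-validation with K = 5 (an assumed value).
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Grid search: fit the learner on every grid point and fold, then score it.
search = GridSearchCV(
    estimator=DecisionTreeRegressor(random_state=0),
    param_grid=param_grid,
    cv=cv,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

# By default GridSearchCV refits the best model on all data, so predict() works.
best_params = search.best_params_
y_pred = search.predict(X)
print("Best hyperparameters:", best_params)
```

For a classification setting, the same pattern would apply with a classifier and a classification metric such as "accuracy" in place of the regressor and the squared-error score.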