Practical foundations of machine learning for addiction research. Part I. Methods and techniques

Pablo Cresta Morgado, Martín Carusso, Laura Alonso Alemany, Laura Acion
DOI: https://doi.org/10.1080/00952990.2021.1995739
2022-05-04
Abstract: Machine learning assembles a broad set of methods and techniques to solve a wide range of problems, such as identifying individuals with substance use disorders (SUD), finding patterns in neuroimages, understanding SUD prognostic factors and their association, or determining addiction genetic underpinnings. However, the addiction research field underuses machine learning. This two-part narrative review focuses on machine learning tools and concepts, providing an introductory insight into their capabilities to facilitate their understanding and acquisition by addiction researchers. This first part presents supervised and unsupervised methods such as linear models, naive Bayes, support vector machines, artificial neural networks, and k-means. We illustrate each technique with examples of its use in current addiction research. We also present some open-source programming tools and methodological good practices that facilitate using these techniques. Throughout this work, we emphasize a continuum between applied statistics and machine learning, show their commonalities, and provide sources for further reading to deepen the understanding of these methods. This two-part review is a primer for the next generation of addiction researchers incorporating machine learning in their projects. Researchers will find a bridge between applied statistics and machine learning, ways to expand their analytical toolkit, recommendations to incorporate well-established good practices in addiction data analysis (e.g., stating the rationale for using newer analytical tools, calculating sample size, improving reproducibility), and the vocabulary to enhance collaboration between researchers who do not conduct data analyses and those who do.