Random Forest Algorithm for Prediction of HIV Drug Resistance

Letícia M. Raposo, Paulo Tadeu C. R. Rosa, Flavio F. Nobre
DOI: https://doi.org/10.1007/978-3-030-38021-2_6
2020-01-01
Abstract: The random forest algorithm is a popular choice for genomic data analysis and bioinformatics research. The fundamental idea behind the technique is to combine many decision trees into a single model, using the random subspace method to select predictor variables. It is a nonparametric algorithm, suitable for both regression and classification problems, and it achieves good predictive performance on many types of data. This chapter describes the general characteristics of the random forest algorithm and demonstrates, through a worked application, how it can be used to predict HIV-1 drug resistance. The random forest was compared with two other models, logistic regression and a classification tree; its results showed lower variability, indicating a more stable classifier.
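The two ingredients named in the abstract, many decision trees combined into one model and a random subset of predictors considered at each split, can be illustrated with a minimal sketch. This is not the chapter's implementation: every function name here (gini, build_tree, fit_forest, predict_forest) is hypothetical, the data are synthetic binary indicators standing in for presence or absence of mutations, and the labeling rule is invented purely for illustration. Only the Python standard library is assumed.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, n_try, rng, depth=0, max_depth=8):
    """Grow one tree on binary features; only n_try random features are tried per node."""
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):  # random subset of predictors
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        if not left or not right:
            continue  # this feature does not split the node
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return majority(labels)
    f = best[1]
    branches = []
    for value in (0, 1):
        idx = [i for i, x in enumerate(rows) if x[f] == value]
        branches.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                   n_try, rng, depth + 1, max_depth))
    return (f, branches[0], branches[1])  # (feature, subtree for 0, subtree for 1)

def predict_tree(node, x):
    """Follow splits until a leaf (an int label) is reached."""
    while isinstance(node, tuple):
        f, left, right = node
        node = right if x[f] == 1 else left
    return node

def fit_forest(rows, labels, n_trees=25, n_try=None, seed=0):
    """Fit n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n = len(rows)
    if n_try is None:
        n_try = max(1, int(len(rows[0]) ** 0.5))  # common default: sqrt(#features)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 n_try, rng))
    return forest

def predict_forest(forest, x):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, x) for tree in forest])

# Synthetic example (assumption): 8 binary indicators, arbitrary labeling rule.
demo_rng = random.Random(1)
X = [[demo_rng.randint(0, 1) for _ in range(8)] for _ in range(200)]
y = [1 if x[0] or (x[1] and x[2]) else 0 for x in X]
forest = fit_forest(X, y)
accuracy = sum(predict_forest(forest, x) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

Because each tree sees a different bootstrap sample and a different random subset of predictors at each node, individual trees disagree, and the majority vote averages out much of that disagreement. This is one plausible reading of why the abstract reports lower variability for the forest than for a single classification tree.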