Random Forest Ensemble Visualization

Ken Lau
Abstract: The random forest model has become a very popular data mining algorithm because of its high predictive accuracy and its simplicity of execution. The downside is that the model, which consists of a collection of classification trees, is difficult to interpret. Our proposed visualization aggregates the collection of trees based on the number of feature appearances at node positions. This derived attribute provides a means of analyzing feature interactions, which traditional methods such as variable importance cannot reveal. In addition, we propose a method of quantifying the ensemble of trees based on the correlation of class predictions.
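As a rough illustration of the aggregation idea described in the abstract, the sketch below counts how often each feature appears at each node position across a small forest. It is not the paper's implementation: the tuple encoding of a tree, the heap-style position numbering (root = 1, children of position i at 2i and 2i + 1), and the name `count_features_by_position` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical tree encoding (an assumption, not the paper's format):
# a split node is (feature_name, left_subtree, right_subtree); a leaf is None.

def count_features_by_position(trees):
    """Count (node position, feature) pairs over all trees in the forest.

    Positions use heap-style numbering: the root is 1, and the children
    of position i are 2*i (left) and 2*i + 1 (right).
    """
    counts = Counter()
    for tree in trees:
        stack = [(tree, 1)]  # start each tree at the root position
        while stack:
            node, pos = stack.pop()
            if node is None:  # leaves carry no split feature
                continue
            feature, left, right = node
            counts[(pos, feature)] += 1
            stack.append((left, 2 * pos))
            stack.append((right, 2 * pos + 1))
    return counts

# Toy forest of three small trees over features x1, x2, x3.
forest = [
    ("x1", ("x2", None, None), ("x3", None, None)),
    ("x1", ("x2", None, None), None),
    ("x2", ("x1", None, None), ("x3", None, None)),
]
counts = count_features_by_position(forest)

for (pos, feature), n in sorted(counts.items()):
    print(f"position {pos}: {feature} appears {n} time(s)")
```

In this toy forest, x1 at the root is usually followed by x2 at its left child. Co-occurrence patterns of this kind across positions are what could hint at feature interactions, which a single per-feature importance score does not expose.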
Computer Science