Log-Linear Bayesian Additive Regression Trees for Multinomial Logistic and Count Regression Models

Jared S. Murray
DOI: https://doi.org/10.48550/arXiv.1701.01503
2019-08-27
Abstract: We introduce Bayesian additive regression trees (BART) for log-linear models including multinomial logistic regression and count regression with zero-inflation and overdispersion. BART has been applied to nonparametric mean regression and binary classification problems in a range of settings. However, existing applications of BART have been limited to models for Gaussian "data", either observed or latent. This is primarily because efficient MCMC algorithms are available for Gaussian likelihoods. But while many useful models are naturally cast in terms of latent Gaussian variables, many others are not -- including models considered in this paper. We develop new data augmentation strategies and carefully specified prior distributions for these new models. Like the original BART prior, the new prior distributions are carefully constructed and calibrated to be flexible while guarding against overfitting. Together the new priors and data augmentation schemes allow us to implement an efficient MCMC sampler outside the context of Gaussian models. The utility of these new methods is illustrated with examples and an application to a previously published dataset.
Methodology