TY  - GEN
AB  - This dissertation develops a novel stochastic tree ensemble method for nonlinear regression, which I refer to as XBART, short for Accelerated Bayesian Additive Regression Trees. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning approaches, the new method attains state-of-the-art performance: in many settings it is both faster and more accurate than the widely used XGBoost algorithm. Via careful simulation studies, I demonstrate that the new approach provides accurate pointwise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras). This dissertation also proves a number of basic theoretical results about the new algorithm, including consistency of the single-tree version of the model and stationarity of the Markov chain produced by the ensemble version. Furthermore, I demonstrate that initializing standard Bayesian additive regression trees Markov chain Monte Carlo (MCMC) at XBART-fitted trees considerably improves credible interval coverage and reduces total run-time.
AD  - University of Chicago
AU  - He, Jingyu
DA  - 2020-06
DO  - 10.6082/uchicago.2324
ED  - P. Richard Hahn
ED  - Nicholas G. Polson
ED  - Tengyuan Liang
ED  - Ruey S. Tsay
ID  - 2324
KW  - Statistics
KW  - Computer science
KW  - Bayesian
KW  - Machine Learning
KW  - Markov chain Monte Carlo
KW  - Regression Trees
KW  - Supervised Learning
KW  - Tree ensembles
L1  - https://knowledge.uchicago.edu/record/2324/files/He_uchicago_0330D_15292.pdf
L2  - https://knowledge.uchicago.edu/record/2324/files/He_uchicago_0330D_15292.pdf
L4  - https://knowledge.uchicago.edu/record/2324/files/He_uchicago_0330D_15292.pdf
LA  - eng
LK  - https://knowledge.uchicago.edu/record/2324/files/He_uchicago_0330D_15292.pdf
N2  - This dissertation develops a novel stochastic tree ensemble method for nonlinear regression, which I refer to as XBART, short for Accelerated Bayesian Additive Regression Trees. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning approaches, the new method attains state-of-the-art performance: in many settings it is both faster and more accurate than the widely used XGBoost algorithm. Via careful simulation studies, I demonstrate that the new approach provides accurate pointwise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras). This dissertation also proves a number of basic theoretical results about the new algorithm, including consistency of the single-tree version of the model and stationarity of the Markov chain produced by the ensemble version. Furthermore, I demonstrate that initializing standard Bayesian additive regression trees Markov chain Monte Carlo (MCMC) at XBART-fitted trees considerably improves credible interval coverage and reduces total run-time.
PB  - University of Chicago
PY  - 2020
T1  - XBART: A Scalable Stochastic Algorithm for Supervised Machine Learning with Additive Tree Ensembles
TI  - XBART: A Scalable Stochastic Algorithm for Supervised Machine Learning with Additive Tree Ensembles
UR  - https://knowledge.uchicago.edu/record/2324/files/He_uchicago_0330D_15292.pdf
Y1  - 2020-06
ER  -