
Incorporating Grouping Information Into Bayesian Decision Tree Ensembles

Incorporating Grouping Information Into Bayesian Decision Tree Ensembles, by Junliang Du and Antonio R. Linero. Grouping structures are common in omics, with groups corresponding to groups of genes or groups of SNPs.


  1. Incorporating Grouping Information Into Bayesian Decision Tree Ensembles. Junliang Du and Antonio R. Linero.

  2. Grouping Structures. Common scenarios: omics, with groups corresponding to groups of genes or groups of SNPs.

  3. Additive Models. Assume the target $f(x)$ decomposes additively as $f(x) = \sum_{t=1}^{m} g(x; T_t, M_t)$ for some adaptively chosen basis functions $g(x; T_t, M_t)$. BART: the basis functions are decision trees; similar in many respects to gradient boosting with decision trees.
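
The additive decomposition above can be sketched in a few lines. This is only an illustration of evaluating a sum of trees, not of how BART fits them; the `Node` class, the helper names `g` and `f`, and the toy trees are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A node of a binary decision tree; a leaf carries a prediction `mu`."""
    feature: Optional[int] = None      # index of the splitting predictor (None for a leaf)
    threshold: Optional[float] = None  # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    mu: float = 0.0                    # leaf parameter (an element of M_t)


def g(x: Sequence[float], tree: Node) -> float:
    """Evaluate one basis function g(x; T_t, M_t) by walking down to a leaf."""
    node = tree
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.mu


def f(x: Sequence[float], trees: Sequence[Node]) -> float:
    """Sum-of-trees prediction f(x) = sum over t of g(x; T_t, M_t)."""
    return sum(g(x, t) for t in trees)


# Two hand-built toy trees (hypothetical values, for illustration only).
tree1 = Node(feature=0, threshold=0.5, left=Node(mu=-1.0), right=Node(mu=1.0))
tree2 = Node(feature=1, threshold=0.0, left=Node(mu=0.25), right=Node(mu=0.75))
```

Calling `f([0.2, 1.0], [tree1, tree2])` adds the leaf reached in each tree, so each tree contributes one term to the prediction.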

  4. Variable Importance. Define the variable importance $s_j$ of predictor $j$ as $\Pr(\text{a given decision rule uses predictor } j)$. For example, the probability of splitting on $x_2$ and $x_3$ in the tree shown on the slide is $s_2 \cdot s_3$. Near-sparse $s$ $\Longrightarrow$ a small subset of predictors is used.
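
A minimal sketch of the product rule for split probabilities, assuming each decision rule draws its predictor independently according to $s$. The function names and the importance vector are hypothetical, and indices are 0-based, so $x_2$ and $x_3$ correspond to indices 1 and 2.

```python
import math
import random
from typing import Sequence


def rule_sequence_probability(s: Sequence[float], predictors: Sequence[int]) -> float:
    """Probability that successive decision rules use exactly these predictors,
    assuming each rule's predictor is drawn independently with probabilities s."""
    return math.prod(s[j] for j in predictors)


def sample_predictor(s: Sequence[float], rng: random.Random) -> int:
    """Draw the predictor index for one decision rule according to s."""
    return rng.choices(range(len(s)), weights=s, k=1)[0]


# Hypothetical, nearly sparse importance vector over 5 predictors (0-indexed).
s = [0.01, 0.48, 0.49, 0.01, 0.01]

# Rules on x_2 and x_3 (indices 1 and 2) have probability s_2 * s_3.
p = rule_sequence_probability(s, [1, 2])
```

With a nearly sparse `s` such as this one, almost all draws from `sample_predictor` land on the two predictors that carry most of the mass.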

  7. Overlapping Group BART. LDA-like model: sampling predictor $j$ arises by (1) sampling a group according to $\pi$; and (2) sampling a predictor within the group according to $w_g$. Set $s = W\pi$, with $\pi \in S^{G-1}$ and $w_g \in S^{P-1}$. Incorporate grouping information into the sparsity pattern of $w_g = (w_{g1}, \ldots, w_{gP})$. Sparsity-inducing priors on $\pi$ and $w_g$ $\Longrightarrow$ bi-level selection!
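
The two-stage draw and the identity $s = W\pi$ can be sketched as follows. This assumes $W$ collects the vectors $w_g$, so that $s_j = \sum_g \pi_g w_{gj}$, and that $w_g$ is zero for predictors outside group $g$ (one reading of "sparsity pattern"). The toy groups, weights, and function names are hypothetical, and no prior or posterior sampling is shown.

```python
import random
from typing import List, Sequence


def importance(W: Sequence[Sequence[float]], pi: Sequence[float]) -> List[float]:
    """Compute s = W pi. Here W[g] stores w_g as a row (the transpose of the
    slide's layout), so s_j = sum over g of pi_g * w_gj."""
    P = len(W[0])
    return [sum(pi[g] * W[g][j] for g in range(len(pi))) for j in range(P)]


def sample_predictor(W: Sequence[Sequence[float]], pi: Sequence[float],
                     rng: random.Random) -> int:
    """Two-stage draw: group g according to pi, then predictor j according to w_g."""
    g = rng.choices(range(len(pi)), weights=pi, k=1)[0]
    return rng.choices(range(len(W[g])), weights=W[g], k=1)[0]


# Hypothetical toy setup: P = 4 predictors, G = 2 overlapping groups.
# Group 0 = {0, 1}, group 1 = {1, 2, 3}; each w_g is zero outside its group.
W = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.2, 0.4, 0.4],
]
pi = [0.75, 0.25]
s = importance(W, pi)
```

Predictor 1 belongs to both groups, so its importance collects mass from both; a sparse `pi` switches off whole groups, while a sparse `w_g` switches off individual predictors within a group.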

  8. Simulation Studies. Nonparametric ground truth (one relevant group, 5 relevant predictors, 50 members of the group, 500 predictors). [Figure: four panels (F1, FN, FP, RMSE) plotted against σ, comparing GB-Correct(1,1), GB-Correct(10,10), GB-Wrong(1,1), GB-Wrong(10,10), and SB.]
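
The slide does not define the F1, FN, and FP panels. Assuming they are the usual variable-selection quantities (false positives and false negatives counted over predictors, and F1 as the harmonic mean of precision and recall), they could be computed as in this sketch; the function name and the example sets are hypothetical.

```python
from typing import Set, Tuple


def selection_metrics(selected: Set[int], relevant: Set[int]) -> Tuple[int, int, float]:
    """Return (FP, FN, F1) for a set of selected predictors, under the
    assumption that these are the standard variable-selection metrics."""
    tp = len(selected & relevant)   # relevant predictors that were selected
    fp = len(selected - relevant)   # irrelevant predictors that were selected
    fn = len(relevant - selected)   # relevant predictors that were missed
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return fp, fn, f1


# Hypothetical example: 5 relevant predictors, as in the simulation setup.
relevant = {0, 1, 2, 3, 4}
selected = {0, 1, 2, 7}
fp, fn, f1 = selection_metrics(selected, relevant)
```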

  9. Breast Cancer Data. Cross-validation suggests encouraging performance on the breast cancer dataset of Van De Vijver et al. (2002) (classification of metastatic/non-metastatic tumors).

     | Method   | Average Heldout Deviance |
     |----------|--------------------------|
     | OG-BART  | 620                      |
     | SBART    | 646 (0.005)              |
     | OG-Lasso | 797 (< 0.0001)           |
     | cMCP     | 698 (0.014)              |
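
The slide does not say how the heldout deviance is computed. For a binary outcome, one common definition is minus twice the Bernoulli log-likelihood of the held-out labels under the predicted probabilities; the sketch below assumes that definition, and the function name and clipping constant `eps` are hypothetical.

```python
import math
from typing import Sequence


def binomial_deviance(y: Sequence[int], p: Sequence[float], eps: float = 1e-12) -> float:
    """Deviance -2 * sum(y*log(p) + (1-y)*log(1-p)) on held-out data.
    Assumed definition; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -2.0 * total


# Hypothetical held-out labels and predicted probabilities.
dev = binomial_deviance([1, 0], [0.5, 0.5])
```

Lower values mean the predicted probabilities fit the held-out labels better, which is the direction in which the table ranks the methods.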

  10. Thanks!

  11. Bleich, J., Kapelner, A., George, E. I., and Jensen, S. T. (2014). Variable selection for BART: An application to gene regulation. The Annals of Applied Statistics, 8(3):1750–1781.

      Van De Vijver, M. J., He, Y. D., Van’t Veer, L. J., Dai, H., Hart, A. A., Voskuil, D. W., Schreiber, G. J., Peterse, J. L., Roberts, C., and Marton, M. J. (2002). A gene-expression signature as a predictor of survival in breast cancer. New England Journal of Medicine, 347(25):1999–2009.
