
Party on! A new, conditional variable importance measure for random forests, available in party. Carolin Strobl (LMU München) and Achim Zeileis (WU Wien).


  1. Party on! A new, conditional variable importance measure for random forests, available in party. Carolin Strobl (LMU München) and Achim Zeileis (WU Wien), useR! 2009.

  2. Introduction
     Random forests
     ◮ have become increasingly popular in, e.g., genetics and the neurosciences
     ◮ can deal with “small n, large p” problems, high-order interactions, and correlated predictor variables
     ◮ are used not only for prediction, but also to measure variable importance
       (advantage: RF variable importance measures capture the effect of a variable in main effects and interactions → smarter for screening than univariate measures)

  3. (Small) random forest. [Figure: several classification trees from a small random forest, each splitting on the predictors Start, Age, and Number.]

  4. Measuring variable importance

  5. Measuring variable importance
     ◮ Gini importance: mean Gini gain produced by X_j over all trees
       (can be severely biased due to estimation bias and multiple testing; Strobl et al., 2007)

  6. Measuring variable importance
     ◮ Gini importance: mean Gini gain produced by X_j over all trees
       (can be severely biased due to estimation bias and multiple testing; Strobl et al., 2007)
     ◮ permutation importance: mean decrease in classification accuracy after permuting X_j over all trees
       (unbiased when subsampling is used; Strobl et al., 2007)

  7. The permutation importance within each tree t

     VI^(t)(x_j) = Σ_{i ∈ B̄^(t)} I(y_i = ŷ_i^(t)) / |B̄^(t)|  −  Σ_{i ∈ B̄^(t)} I(y_i = ŷ_{i,π_j}^(t)) / |B̄^(t)|

     where B̄^(t) denotes the out-of-bag sample of tree t,
     ŷ_i^(t) = f^(t)(x_i) is the predicted class before permuting,
     ŷ_{i,π_j}^(t) = f^(t)(x_{i,π_j}) is the predicted class after permuting X_j, and
     x_{i,π_j} = (x_{i,1}, …, x_{i,j−1}, x_{π_j(i),j}, x_{i,j+1}, …, x_{i,p}).

     Note: VI^(t)(x_j) = 0 by definition if X_j is not in tree t.

  8. The permutation importance over all trees:

     VI(x_j) = Σ_{t=1}^{ntree} VI^(t)(x_j) / ntree
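     As an illustration of the two formulas above, here is a minimal R sketch of the per-tree importance and its average over trees. It is not the implementation used in party: predict_tree() and the list structure of forest (each element holding a fitted tree and its out-of-bag row indices) are hypothetical placeholders.

     ```r
     ## Minimal sketch of the (unconditional) permutation importance.
     ## `predict_tree()` and the structure of `forest` are hypothetical,
     ## not the internals of party.

     perm_importance_tree <- function(tree, X, y, oob, j) {
       X_oob <- X[oob, , drop = FALSE]
       ## accuracy on the out-of-bag sample before permuting X_j
       acc_before <- mean(y[oob] == predict_tree(tree, X_oob))
       ## permute X_j within the out-of-bag sample, then recompute accuracy
       X_oob[, j] <- sample(X_oob[, j])
       acc_after <- mean(y[oob] == predict_tree(tree, X_oob))
       acc_before - acc_after               # VI^(t)(x_j)
     }

     perm_importance <- function(forest, X, y, j) {
       ## average the per-tree importances over all ntree trees
       mean(sapply(forest, function(f)
         perm_importance_tree(f$tree, X, y, f$oob, j)))
     }
     ```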

  9. What null hypothesis does this permutation scheme correspond to?

     obs    Y      X_j              Z
     1      y_1    x_{π_j(1),j}     z_1
     ⋮      ⋮      ⋮                ⋮
     i      y_i    x_{π_j(i),j}     z_i
     ⋮      ⋮      ⋮                ⋮
     n      y_n    x_{π_j(n),j}     z_n

     H_0: X_j ⊥ Y, Z   or   X_j ⊥ Y ∧ X_j ⊥ Z
     H_0: P(Y, X_j, Z) = P(Y, Z) · P(X_j)

  10. What null hypothesis does this permutation scheme correspond to?

     The current null hypothesis reflects independence of X_j from both Y and the remaining predictor variables Z
     ⇒ a high variable importance can result from violation of either one!

  11. Suggestion: Conditional permutation scheme

     obs    Y       X_j                    Z
     1      y_1     x_{π_{j|Z=a}(1),j}     z_1 = a
     3      y_3     x_{π_{j|Z=a}(3),j}     z_3 = a
     27     y_27    x_{π_{j|Z=a}(27),j}    z_27 = a
     6      y_6     x_{π_{j|Z=b}(6),j}     z_6 = b
     14     y_14    x_{π_{j|Z=b}(14),j}    z_14 = b
     33     y_33    x_{π_{j|Z=b}(33),j}    z_33 = b
     ⋮      ⋮       ⋮                      ⋮

     H_0: X_j ⊥ Y | Z
     H_0: P(Y, X_j | Z) = P(Y | Z) · P(X_j | Z)   or   P(Y | X_j, Z) = P(Y | Z)
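     A minimal R sketch of this conditional permutation scheme: X_j is shuffled only within the cells of a grouping grid built from (a partition of) the conditioning variables Z, so its association with Z is preserved while its conditional association with Y is broken. The function and the grid construction are illustrative assumptions, not party internals.

     ```r
     ## Sketch of the conditional permutation: shuffle x_j only within the
     ## cells of a conditioning grid, keeping its association with Z intact.
     ## `grid` is assumed to be a factor encoding the conditioning cells.
     conditional_permute <- function(x_j, grid) {
       x_perm <- x_j
       for (cell in split(seq_along(x_j), grid)) {
         if (length(cell) > 1L)              # sample() misbehaves on a single index
           x_perm[cell] <- x_j[sample(cell)]
       }
       x_perm
     }

     ## Example (hypothetical): build the grid from two discretised Z variables
     # grid   <- interaction(cut(z1, breaks = 3), cut(z2, breaks = 3))
     # x_perm <- conditional_permute(x[, j], grid)
     ```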

  12. Technically
     ◮ use any partition of the feature space for conditioning

  13. Technically
     ◮ use any partition of the feature space for conditioning
     ◮ here: use the binary partition already learned by the tree
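     This conditional scheme is what the title refers to as being available in party: cforest() grows a forest of conditional inference trees and varimp() computes the permutation importance, with the conditional argument switching to the conditional scheme. A usage sketch, assuming a data frame dat with response y:

     ```r
     library("party")

     ## fit a random forest of conditional inference trees
     ## (`dat` and the formula are assumed placeholders)
     cf <- cforest(y ~ ., data = dat,
                   controls = cforest_unbiased(ntree = 500, mtry = 3))

     varimp(cf)                      # unconditional permutation importance
     varimp(cf, conditional = TRUE)  # conditional permutation importance
     ```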

  14. Simulation study
     ◮ dgp: y_i = β_1 · x_{i,1} + … + β_12 · x_{i,12} + ε_i,   ε_i ~ i.i.d. N(0, 0.5)
     ◮ X_1, …, X_12 ~ N(0, Σ) with

       Σ = ( 1    0.9  0.9  0.9  0   ⋯  0
             0.9  1    0.9  0.9  0   ⋯  0
             0.9  0.9  1    0.9  0   ⋯  0
             0.9  0.9  0.9  1    0   ⋯  0
             0    0    0    0    1   ⋯  0
             ⋮    ⋮    ⋮    ⋮    ⋮   ⋱  0
             0    0    0    0    0   0  1 )

     X_j    X_1  X_2  X_3  X_4  X_5  X_6  X_7  X_8  ⋯  X_12
     β_j     5    5    2    0   −5   −5   −2    0   ⋯    0
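     A sketch of this data-generating process in R; the sample size n and the reading of N(0, 0.5) as variance 0.5 are assumptions not given on the slide. The mvtnorm package is used to draw the correlated predictors.

     ```r
     library("mvtnorm")

     n <- 100                                 # assumed sample size
     p <- 12

     ## block-correlated predictors: X1..X4 with pairwise correlation 0.9,
     ## the remaining predictors independent
     Sigma <- diag(p)
     Sigma[1:4, 1:4] <- 0.9
     diag(Sigma) <- 1
     X <- rmvnorm(n, mean = rep(0, p), sigma = Sigma)

     ## linear dgp with the coefficients from the slide
     beta <- c(5, 5, 2, 0, -5, -5, -2, rep(0, 5))
     y <- drop(X %*% beta) + rnorm(n, sd = sqrt(0.5))  # epsilon ~ N(0, 0.5), read as variance

     dat <- data.frame(y = y, X)
     ```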

  15. Results. [Figure: variable importance of predictors X_1, …, X_12 for mtry = 1, 3, and 8.]

  16. Peptide-binding data. [Figure: unconditional vs. conditional variable importance; highlighted predictors: h2y8, flex8, pol3.]
