On Robust Trimming of Bayesian Network Classifiers
YooJung Choi and Guy Van den Broeck, UCLA
Bayesian Network Classifiers
[Figure: a Bayesian network classifier with a Class node, latent nodes, and feature nodes Test 1 through Test 4.]
Can we make the same classifications with fewer features?
Why Classification Similarity?
To preserve classification behavior on individual examples:
• Fairness
• Deployed classifiers
How to Measure Similarity?
“Expected Classification Agreement”: with what probability does a classifier α agree with its trimming β?
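A sketch of how this can be written (the notation F, F′, and f′ is my assumption, not verbatim from the slides): with feature set F, retained subset F′ ⊆ F, and f′ the restriction of a full instantiation f to F′, the expected classification agreement between α and its trimming β is

```latex
A(\alpha,\beta)
  = \mathbb{E}_{\mathbf{f} \sim \Pr}\!\left[ \mathbb{1}\{\alpha(\mathbf{f}) = \beta(\mathbf{f}')\} \right]
  = \sum_{\mathbf{f}} \Pr(\mathbf{f}) \, \mathbb{1}\{\alpha(\mathbf{f}) = \beta(\mathbf{f}')\}.
```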
Robust Trimming
[Figure: the trimmed classifier is compared against the original classifier under the similarity measure.]
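One plausible formalization of the robust trimming problem pictured here (the threshold form is my reading of the slide, not a verbatim statement):

```latex
\text{Given a classifier } \alpha \text{ over features } \mathbf{F}
\text{ and a threshold } \delta, \text{ find the smallest } \mathbf{F}' \subseteq \mathbf{F}
\text{ such that } \max_{\beta} A(\alpha,\beta) \ge \delta.
```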
Trimming Algorithm
Feature subset selection:
• Search over candidate feature subsets
• Objective function: the “Maximum Achievable Agreement” (MAA)
Trimming Algorithm
• Branch-and-bound search
• Need a bound on the MAA to prune subtrees (a search sketch follows the MPA slides below)
Upper Bound for the MAA: “Maximum Potential Agreement” (MPA)
The maximum agreement between α and a hypothetical function that maps each f′ to a class c.
Maximum Potential Agreement
1. Upper-bounds the MAA (great for pruning!)
2. Monotonically increasing in the feature subset
3. Generally easier to compute than the MAA
4. Equal to the MAA under certain independence conditions (e.g., naive Bayes)
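Expanding the slide's description (the algebra below is my reading, not shown verbatim): the MPA of a subset F′ is the best agreement any function g from sub-instantiations f′ to classes can achieve,

```latex
\mathrm{MPA}(\mathbf{F}')
  = \max_{g : \mathbf{f}' \mapsto c} \Pr\!\big( \alpha(\mathbf{F}) = g(\mathbf{F}') \big)
  = \sum_{\mathbf{f}'} \max_{c} \Pr\!\big( \alpha(\mathbf{F}) = c, \, \mathbf{f}' \big).
```

Since any trimmed classifier β is one such function g, the MPA upper-bounds the MAA; and enlarging F′ refines the sub-instantiations, so the inner maxima can only grow, which gives the monotonicity used for pruning.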
Computing the MPA and MAA
Prior works compute these quantities based on knowledge compilation into arithmetic circuits [Oztok, Choi, Darwiche 2016; Choi, Darwiche, Van den Broeck 2017].
[Figure: an example Bayesian network with CPTs Pr(S1 | E) and Pr(E), compiled into an arithmetic circuit with indicator variables Q1, ..., Q4 and associated parameter values.]
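A minimal, self-contained sketch of the branch-and-bound trimming search from the earlier slides, with the MPA as the pruning bound. This is not the authors' implementation: the paper computes these quantities on compiled arithmetic circuits, whereas this sketch enumerates an explicit joint distribution, and all names (`alpha`, `joint`, `mpa`, `best_subset`) are illustrative assumptions.

```python
from itertools import product


def mpa(alpha, joint, features, subset):
    """Maximum potential agreement of a candidate subset:
    sum over sub-instantiations f' of max_c Pr(alpha(F) = c, f')."""
    idx = [features.index(v) for v in subset]
    table = {}  # f' -> {class c: Pr(alpha(F) = c, f')}
    for f, p in joint.items():
        key = tuple(f[i] for i in idx)
        c = alpha(f)
        table.setdefault(key, {})
        table[key][c] = table[key].get(c, 0.0) + p
    return sum(max(by_class.values()) for by_class in table.values())


def best_subset(alpha, joint, features, k):
    """Branch-and-bound over feature subsets of size k. Assumes the
    independence condition under which MPA = MAA (e.g. naive Bayes), so
    the MPA scores leaves exactly; otherwise replace the leaf score with
    a true MAA computation."""
    best_score, best_sub = -1.0, None

    def search(chosen, remaining):
        nonlocal best_score, best_sub
        if len(chosen) == k:
            score = mpa(alpha, joint, features, chosen)  # leaf score
            if score > best_score:
                best_score, best_sub = score, tuple(chosen)
            return
        if len(chosen) + len(remaining) < k:
            return  # cannot reach size k anymore
        # Prune: by monotonicity, MPA(chosen + remaining) upper-bounds
        # the MAA of every size-k completion of `chosen`.
        if mpa(alpha, joint, features, chosen + remaining) <= best_score:
            return
        search(chosen + [remaining[0]], remaining[1:])  # include feature
        search(chosen, remaining[1:])                   # exclude feature

    search([], list(features))
    return best_sub, best_score


# Tiny usage example: a hand-made uniform joint over three binary
# features, where alpha simply returns the value of the first feature.
if __name__ == "__main__":
    features = ["T1", "T2", "T3"]
    joint = {f: 1.0 / 8 for f in product([0, 1], repeat=3)}
    alpha = lambda f: int(f[0])  # class = value of T1
    print(best_subset(alpha, joint, features, 1))  # (('T1',), 1.0)
```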
Evaluation
Branch-and-bound improves efficiency, even with the extra upper-bound computations.
Evaluation
High information gain does not imply high classification agreement: information-theoretic measures are unaware of changes in the classification threshold.
Thank you! Questions?