Discriminative Bias for Learning Probabilistic Sentential Decision Diagrams
Laura I. Galindez Olascoaga, Wannes Meert, Nimish Shah, Guy Van den Broeck, Marian Verhelst
IDA 2020
Outline
- Motivation and objective
- Background
- Discriminative bias for learning PSDDs
- Experimental results
- Conclusions
Motivation
- Probabilistic inference has proven to be well suited for resource-constrained embedded applications (Galindez Olascoaga et al., 2019).
- Probabilistic circuits successfully balance efficiency vs. expressiveness trade-offs while remaining robust.
- Some of these models’ robustness, which comes from generative learning, is at odds with discriminative performance.
Objective
- Keep the robustness provided by generative learning strategies.
- Improve discriminative performance by exploiting PSDDs’ knowledge encoding capabilities.
Background: probabilistic inference
Given a probabilistic model m of the world, answer probabilistic queries, for example (with e the observed evidence and q the query variables):
- Evidence: $q_1(m) = \Pr_m(e)$
- Conditional: $q_2(m) = \Pr_m(q \mid e)$
- MAP: $q_3(m) = \arg\max_q \Pr_m(q \mid e)$
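Each of these query types can be made concrete on a tiny example. Below is a minimal sketch, not from the slides, that answers all three by brute-force summation over an explicit joint distribution; a probabilistic circuit answers the same queries without materializing the full table. All numbers are illustrative.

```python
# Toy joint distribution Pr(Rain, Sun) as an explicit table
# (illustrative numbers only).
joint = {
    (True, True): 0.02, (True, False): 0.18,
    (False, True): 0.56, (False, False): 0.24,
}

def pr(event):
    """Pr(event), where event maps variable index -> required value."""
    return sum(p for assignment, p in joint.items()
               if all(assignment[i] == v for i, v in event.items()))

q1 = pr({0: True})                           # evidence: Pr(Rain)
q2 = pr({0: True, 1: True}) / pr({0: True})  # conditional: Pr(Sun | Rain)
q3 = max([True, False],                      # MAP: argmax_s Pr(Sun=s | Rain)
         key=lambda s: pr({0: True, 1: s}))

print(q1, q2, q3)  # approximately 0.2, 0.1, False
```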
Background: tractable probabilistic inference
A query q(m) is tractable iff exactly computing it runs in time O(poly(|m|)). There is an inherent trade-off between tractability and expressiveness.
(From the UAI 2019 tutorial on Tractable Probabilistic Models by Vergari, Di Mauro and Van den Broeck, and the AAAI 2020 tutorial on Probabilistic Circuits by Vergari, Choi, Peharz and Van den Broeck.)
Background: probabilistic circuits
A probabilistic circuit is a computational graph that encodes a probability distribution p(X).
(From the same UAI 2019 and AAAI 2020 tutorials.)
Background: what is a PSDD?
PSDDs are probabilistic extensions of SDDs, which represent Boolean functions as logical circuits (Kisa et al., 2014). The running example (from Liang et al., 2017) is a distribution over Rain, Sun and Rbow (rainbow) that can be represented either as a Bayesian network or as a PSDD:
- Pr(Rain) = 0.2
- Pr(Sun | Rain) = 0.1, Pr(Sun | ¬Rain) = 0.7
- Pr(Rbow) = 1 if Rain ∧ Sun, and Pr(Rbow) = 0 otherwise
The PSDD encodes the same distribution with edge parameters (0.2/0.8, 0.1/0.9, 0.7, 1.0).
Background: PSDDs’ properties
(Figure: a decision node with parameters θ₁ = 0.2, θ₂ = 0.8, and the corresponding vtree.)
- The left input of an AND gate is the prime (p) and the right is the sub (s).
- Edges of decision nodes are annotated with a normalized probability distribution.
Background: PSDDs’ properties
Syntactic restrictions (see Kisa et al., 2014):
1) Decomposability: the inputs of an AND node must be over disjoint variable sets. For example, at node 1: prime variables X = {Rain}, sub variables Y = {Sun, Rbow}.
2) Determinism: only one of a decision node’s inputs can be true.
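These two restrictions are what make PSDD inference tractable, and both are mechanical to check. A minimal sketch for the example node above; the element dictionaries and field names are hypothetical, not LearnPSDD’s actual data structures:

```python
# Each element of a decision node pairs a prime with a sub.
# Primes are Boolean functions over an assignment; scopes record
# which variables each input mentions.
elements = [
    {"prime": lambda a: a["Rain"], "prime_scope": {"Rain"},
     "sub_scope": {"Sun", "Rbow"}},
    {"prime": lambda a: not a["Rain"], "prime_scope": {"Rain"},
     "sub_scope": {"Sun", "Rbow"}},
]

def decomposable(elements):
    # Decomposability: prime and sub variable sets are disjoint.
    return all(not (e["prime_scope"] & e["sub_scope"]) for e in elements)

def deterministic(elements, assignment):
    # Determinism: only one prime is true under any complete assignment
    # (here the primes Rain / not-Rain are exhaustive, so exactly one).
    return sum(e["prime"](assignment) for e in elements) == 1

assignment = {"Rain": True, "Sun": False, "Rbow": False}
print(decomposable(elements), deterministic(elements, assignment))  # True True
```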
Background: PSDDs’ properties
A decision node q encodes the distribution
$\Pr_q(\mathbf{X}\mathbf{Y}) = \sum_i \theta_i \Pr_{p_i}(\mathbf{X}) \Pr_{s_i}(\mathbf{Y})$,
and conditioning on the i-th prime gives
$\Pr_q(\mathbf{X}\mathbf{Y} \mid [p_i]) = \Pr_{p_i}(\mathbf{X} \mid [p_i]) \Pr_{s_i}(\mathbf{Y} \mid [p_i]) = \Pr_{p_i}(\mathbf{X}) \Pr_{s_i}(\mathbf{Y})$,
where $[\cdot]$ denotes the base: a logical sentence that defines the support of a node’s distribution.
Background: PSDDs’ properties
For example, at node 1 (with prime variables X = {Rain} and sub variables Y = {Sun, Rbow}, and $\theta_1 = 0.2$, $\theta_2 = 0.8$), the decision-node distribution instantiates to
$\Pr_1(\mathbf{X}\mathbf{Y}) = 0.2 \cdot \Pr_{p_1}(\mathbf{X}) \Pr_{s_1}(\mathbf{Y}) + 0.8 \cdot \Pr_{p_2}(\mathbf{X}) \Pr_{s_2}(\mathbf{Y})$
$\phantom{\Pr_1(\mathbf{X}\mathbf{Y})} = 0.2 \cdot \Pr_{p_1}(\mathbf{X} \mid [\text{Rain}]) \Pr_{s_1}(\mathbf{Y} \mid [\text{Rain}]) + 0.8 \cdot \Pr_{p_2}(\mathbf{X} \mid [\neg\text{Rain}]) \Pr_{s_2}(\mathbf{Y} \mid [\neg\text{Rain}])$
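Read as code, a decision node is a small mixture whose components are products of prime and sub distributions. The sketch below evaluates node 1 of the rainbow example bottom-up; the parameters follow the reconstruction above and should be treated as illustrative:

```python
# Pr_q(XY) = sum_i theta_i * Pr_pi(X) * Pr_si(Y), evaluated for the
# Rain/Sun/Rbow example with the slide's parameters.

def pr_node1(rain, sun, rbow):
    # Element 1: prime Rain (theta_1 = 0.2); its sub encodes
    # Pr(Sun | Rain) = 0.1 and Rbow <=> (Rain and Sun).
    pr_elem1 = (0.2 if rain else 0.0) \
        * (0.1 if sun else 0.9) \
        * (1.0 if rbow == (rain and sun) else 0.0)
    # Element 2: prime not-Rain (theta_2 = 0.8); its sub encodes
    # Pr(Sun | not Rain) = 0.7 and Rbow always false.
    pr_elem2 = (0.8 if not rain else 0.0) \
        * (0.7 if sun else 0.3) \
        * (1.0 if not rbow else 0.0)
    # Determinism: at most one element is non-zero for any input.
    return pr_elem1 + pr_elem2

print(pr_node1(True, True, True))    # 0.2 * 0.1 * 1.0 = 0.02
print(pr_node1(False, True, False))  # 0.8 * 0.7 * 1.0 = 0.56
```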
Background: learning PSDDs
The LearnPSDD algorithm (Liang et al., 2017) learns the PSDD structure incrementally from data:
1) Learn a vtree from data (minimizing mutual information).
2) Iteratively apply split and clone operations:
   - generate candidate operations,
   - calculate the log-likelihood improvement of each,
   - execute the best operation.
A sketch of this search loop follows below.
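A high-level sketch of that greedy loop, with every PSDD-specific step (candidate generation, operation execution, likelihood computation) passed in as a hypothetical callable; the scoring details of the actual algorithm are simplified here:

```python
def learn_psdd(data, psdd, candidate_ops, apply_op, log_likelihood,
               iterations=100):
    """Greedy structure search in the style of LearnPSDD (Liang et al., 2017).

    The PSDD-specific machinery is supplied as callables, since a real
    implementation is far beyond this sketch:
      candidate_ops(psdd)        -> list of candidate split/clone operations
      apply_op(psdd, op)         -> new PSDD with the operation executed
      log_likelihood(psdd, data) -> training log-likelihood
    """
    best_ll = log_likelihood(psdd, data)
    for _ in range(iterations):
        candidates = candidate_ops(psdd)
        if not candidates:
            break
        # Score each candidate by the log-likelihood it achieves.
        scored = [(log_likelihood(apply_op(psdd, op), data), op)
                  for op in candidates]
        ll, op = max(scored, key=lambda t: t[0])
        if ll <= best_ll:
            break  # no operation improves the fit; stop greedily
        psdd, best_ll = apply_op(psdd, op), ll
    return psdd
```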
Classification with PSDDs
Given a feature variable set F and a class variable C, the classification task can be stated as a probabilistic query:
$\Pr(C \mid \mathbf{F}) \propto \Pr(\mathbf{F} \mid C) \cdot \Pr(C)$
LearnPSDD, however, remains agnostic to the classification task: with LearnPSDD, features might never be conditioned on the class.
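Under a naive Bayes factorization of Pr(F | C), which is the structure D-LearnPSDD will later enforce at the root, this query is a short computation. A minimal sketch with illustrative parameters over binary features:

```python
import math

# Illustrative parameters: Pr(C) and Pr(F_j = 1 | C) for two classes.
prior = {0: 0.6, 1: 0.4}
likelihood = {0: [0.9, 0.2, 0.5], 1: [0.3, 0.7, 0.8]}

def classify(features):
    """Return argmax_c Pr(c | features) via Pr(features | c) * Pr(c)."""
    scores = {}
    for c in prior:
        logp = math.log(prior[c])
        for fj, theta in zip(features, likelihood[c]):
            logp += math.log(theta if fj else 1.0 - theta)
        scores[c] = logp
    return max(scores, key=scores.get)

print(classify([1, 0, 1]))  # -> 0
```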
Bayesian network classifiers
Effects of explicitly conditioning F on C in $\Pr(C \mid \mathbf{F}) \propto \Pr(\mathbf{F} \mid C) \cdot \Pr(C)$:
- With LearnPSDD, features might never be conditioned on the class.
- With Bayesian network classifiers, features are always conditioned on the class.
(Figure: Bayesian network classifier structure with the class variable C as parent of the features F1, ..., F4.)
Enforcing the discriminative bias: D-LearnPSDD
Goal: make sure that the feature variables F can be conditioned on the class variable C.
- Learn the vtree by minimizing conditional mutual information, and initialize on a fully factorized distribution.
- However, setting the vtree alone is not enough: F can still be independent from C.
- Therefore, set the root decision node so that its primes are the class values, making every sub distribution over F conditioned on a value of C. This encodes a naive Bayes structure (written out below).
- LearnPSDD ensures that the base of the root node remains unchanged, so the bias is preserved throughout learning.
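Written out, the biased root is a decision node whose elements branch on the class value (shown here for a binary class; the notation follows the decision-node distribution from the background section):

```latex
% Root decision node with the class variable C as prime:
% one element per class value, so each sub is conditioned on C.
\Pr_{\mathrm{root}}(C, \mathbf{F})
  = \theta \cdot \Pr_{s_1}(\mathbf{F} \mid C{=}1)
  + (1 - \theta) \cdot \Pr_{s_2}(\mathbf{F} \mid C{=}0)
```

With the fully factorized initialization, each sub further decomposes as $\Pr_{s_i}(\mathbf{F} \mid C{=}c) = \prod_j \Pr(F_j \mid C{=}c)$, which is exactly the naive Bayes structure the slide refers to; subsequent split and clone operations relax this factorization without touching the root’s base.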
Experimental results
Setup:
- 15 UCI datasets
- 5-fold cross-validation
- Average accuracy reported over a range of model sizes
- Model size is the number of parameters
(Results figure: average classification accuracy versus model size for the compared methods.)
Experimental results
D-LearnPSDD remains robust against missing features.
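One source of that robustness is generative semantics: missing features can be marginalized out of the query instead of imputed. A minimal sketch, reusing the illustrative naive Bayes parameters from the classification sketch above:

```python
import math

prior = {0: 0.6, 1: 0.4}
likelihood = {0: [0.9, 0.2, 0.5], 1: [0.3, 0.7, 0.8]}

def classify_with_missing(features):
    """features[j] is 1, 0, or None; None is marginalized out.

    Under naive Bayes, sum_{f_j} Pr(f_j | c) = 1, so a missing
    feature simply drops out of the product.
    """
    scores = {}
    for c in prior:
        logp = math.log(prior[c])
        for fj, theta in zip(features, likelihood[c]):
            if fj is None:
                continue  # marginalized: contributes a factor of 1
            logp += math.log(theta if fj else 1.0 - theta)
        scores[c] = logp
    return max(scores, key=scores.get)

print(classify_with_missing([1, None, 1]))  # second feature missing -> 0
```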
Conclusions
- We introduced a PSDD learning technique that improves classification performance by introducing a discriminative bias.
- Robustness is ensured by exploiting the generative learning strategy.
- The proposed technique outperforms purely generative PSDDs in terms of classification accuracy, and outperforms the other baseline classifiers in terms of robustness.
References
- Laura I. Galindez Olascoaga, Wannes Meert, Nimish Shah, Marian Verhelst and Guy Van den Broeck. Towards Hardware-Aware Tractable Learning of Probabilistic Models. In Advances in Neural Information Processing Systems 32 (NeurIPS), 2019.
- YooJung Choi, Antonio Vergari, Robert Peharz and Guy Van den Broeck. Probabilistic Circuits: Representation and Inference. AAAI tutorial, 2020.
- Yitao Liang, Jessa Bekker and Guy Van den Broeck. Learning the Structure of Probabilistic Sentential Decision Diagrams. In Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI), 2017.
- Doga Kisa, Guy Van den Broeck, Arthur Choi and Adnan Darwiche. Probabilistic Sentential Decision Diagrams. In Proceedings of the 14th International Conference on Principles of Knowledge Representation and Reasoning (KR), 2014.

Thank you!
Contact: laura.galindez@esat.kuleuven.be