Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data
Guy Van den Broeck, Karthika Mohan, Arthur Choi, Adnan Darwiche, and Judea Pearl
UCLA
UAI 2015
Learning from Incomplete Data
• Input: data and BN structure
  E.g., gender wage gap study
  [Figure: Bayesian network over Gender ($X_1$), Experience ($X_2$), Qualification ($X_3$), Income ($X_4$)]

  X1 (Gender)   X2 (Experience)   X3 (Qualification)   X4 (Income)
  0             1                 0                    1
  1             1                 ?                    1
  0             1                 0                    1
  1             ?                 1                    0
  1             0                 ?                    ?
  0             0                 ?                    ?
  0             1                 0                    1

• Output: BN parameters
  E.g., $\theta_{\text{Gender}}$, $\theta_{\text{Experience} \mid \text{Gender}}$, $\theta_{\text{Qualification} \mid \text{Gender}}$, etc.
Current Approaches: Properties

  Property               Likelihood Optimization   Expectation Maximization
  Inference-Free         ✘                         ✘
  Consistent for MCAR    ✔                         ✔ / ✘
  Consistent for MAR     ✔                         ✔ / ✘
  Consistent for MNAR    ✘                         ✘
  Maximum Likelihood     ✔                         ✔ / ✘
  Closed Form            n/a                       ✘
  Passes over the data   n/a                       ?
Problem Statement
• Likelihood optimization and expectation maximization both require inference, neither is consistent for MNAR, and neither offers a closed-form solution.
• Conventional wisdom: this is inevitable!
Contribution

  Property               Likelihood Optimization   Expectation Maximization   Deletion [this paper]
  Inference-Free         ✘                         ✘                          ✔
  Consistent for MCAR    ✔                         ✔ / ✘                      ✔
  Consistent for MAR     ✔                         ✔ / ✘                      ✔
  Consistent for MNAR    ✘                         ✘                          ✔ / ✘
  Maximum Likelihood     ✔                         ✔ / ✘                      ✘
  Closed Form            n/a                       ✘                          ✔
  Passes over the data   n/a                       ?                          1
Missingness Graphs
[Figure: the wage-gap network over Gender ($X_1$), Experience ($X_2$), Qualification ($X_3$), Income ($X_4$), shown alone and extended with missingness mechanisms $R_{X_2}$, $R_{X_3}$, $R_{X_4}$ and proxy nodes such as $X_2^*$]
• Fully observed variables: $X_o = \{X_1\}$
• Partially observed variables: $X_m = \{X_2, X_3, X_4\}$
• Proxy variable: $X_1^* = X_1$ if $R_{X_1} = \text{ob}$, and $X_1^* = m$ if $R_{X_1} = \text{unob}$
Missingness Dataset
• Encoding of the data
  – Fully observed vars $X_o$
  – Causal mechanisms $R$
  – Proxies for $X_m$: $X_1^* = X_1$ if $R_{X_1} = \text{ob}$, and $X_1^* = m$ if $R_{X_1} = \text{unob}$
• Fully observed
• Data distribution $\Pr_D(\cdot)$

  X1   X*2   X*3   R_X2   R_X3   P*
  0    0     0     ob     ob     0.200
  0    0     1     ob     ob     0.100
  0    1     0     ob     ob     0.050
  0    1     1     ob     ob     0.050
  1    0     0     ob     ob     0.060
  1    0     1     ob     ob     0.040
  1    1     0     ob     ob     0.070
  1    1     1     ob     ob     0.030
  0    0     m     ob     unob   0.100
  0    1     m     ob     unob   0.020
  1    0     m     ob     unob   0.080
  1    1     m     ob     unob   0.180
  0    m     0     unob   ob     0.100
  0    m     1     unob   ob     0.020
  …    …     …     …      …      …
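The encoding on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: raw records are dicts in which `None` marks a missing entry, and the helper names `encode_row` and `data_distribution` are hypothetical.

```python
from collections import Counter

MISSING = "m"  # value a proxy takes when the underlying entry is unobserved


def encode_row(row, partially_observed):
    """Map a raw record (None = missing) to proxies X* and mechanisms R.

    Assumption: `row` is a dict from variable name to value, and
    `partially_observed` names the variables in X_m.
    """
    encoded = {}
    for var, value in row.items():
        if var in partially_observed:
            # Proxy: copy the value when observed, otherwise the placeholder m.
            encoded[var + "*"] = MISSING if value is None else value
            # Mechanism: records whether the entry was observed.
            encoded["R_" + var] = "unob" if value is None else "ob"
        else:
            # Fully observed variable, kept unchanged.
            encoded[var] = value
    return encoded


def data_distribution(rows, partially_observed):
    """Empirical distribution Pr_D over the encoded rows (all fully observed)."""
    counts = Counter(
        tuple(sorted(encode_row(r, partially_observed).items())) for r in rows
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}
```

After encoding, every column (fully observed variables, proxies, and mechanisms) has a value in every row, which is why the data distribution can be estimated by plain counting.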
Algorithms
• Missingness categories (classes of graphs)
  – Missing Completely At Random (MCAR)
  – Missing At Random (MAR)
  – Missing Not At Random (MNAR)
• Deletion techniques
  – Direct Deletion
  – Factored Deletion
  – Informed Deletion
Missing Completely at Random (MCAR)
[Figure: missingness graph in which the mechanisms $R_{X_2}$, $R_{X_3}$, $R_{X_4}$ are disconnected from the network over $X_1$, $X_2$ (Experience), $X_3$, $X_4$ (Income)]
• In general: $(X_m, X_o) \perp\!\!\!\perp R$
• In the example: $(X_1, X_2, X_3, X_4) \perp\!\!\!\perp (R_{X_2}, R_{X_3}, R_{X_4})$
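To make the MCAR assumption concrete, the following sketch masks entries with a coin flip that never looks at any value in the row, so the mechanisms are independent of all variables by construction. The function name `mask_mcar`, the `None` convention for missing entries, and the single shared missingness probability are assumptions made for illustration only.

```python
import random


def mask_mcar(rows, partially_observed, p_missing, seed=0):
    """Return a copy of `rows` with entries hidden completely at random.

    Assumptions: rows are dicts, None marks a missing entry, and every
    partially observed variable shares the same missingness probability.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    masked = []
    for row in rows:
        new_row = dict(row)
        for var in partially_observed:
            # The decision depends only on the random draw, never on the data.
            if rng.random() < p_missing:
                new_row[var] = None
        masked.append(new_row)
    return masked
```

Under MAR or MNAR the masking probability would instead depend on other observed values or on the hidden value itself, which this sketch deliberately does not model.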
Direct Deletion (MCAR)
Independencies:
• $(X_1, X_2) \perp\!\!\!\perp R$
• $(X_1, X_2) \perp\!\!\!\perp R_{X_2}$

Estimand:
$\Pr(X_1, X_2)$
$= \Pr(X_1, X_2 \mid R_{X_2} = \text{ob})$
$= \Pr(X_1, X_2^* \mid R_{X_2} = \text{ob})$
$= \Pr_D(X_1, X_2^* \mid R_{X_2} = \text{ob})$

  X1   X*2   X*3   R_X2   R_X3   P*
  0    0     0     ob     ob     0.200
  0    0     1     ob     ob     0.100
  0    1     0     ob     ob     0.050
  0    1     1     ob     ob     0.050
  …    …     …     …      …      …
  0    1     m     ob     unob   0.020
  1    0     m     ob     unob   0.080
  1    1     m     ob     unob   0.180
  0    m     0     unob   ob     0.100
  0    m     1     unob   ob     0.020
  …    …     …     …      …      …

Cf. listwise and pairwise deletion in statistics
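The direct-deletion estimand amounts to dropping every row in which a queried variable is unobserved and counting over the rest. The sketch below illustrates that idea and is not the authors' implementation; the function name `direct_deletion` and the dict-with-`None` data layout are assumptions.

```python
from collections import Counter


def direct_deletion(rows, variables):
    """Estimate Pr(variables) from rows where all of `variables` are observed.

    Assumptions: rows are dicts and None marks a missing entry. Under MCAR,
    conditioning on R = ob does not change the distribution, so the relative
    frequencies in the retained rows estimate the joint.
    """
    complete = [r for r in rows if all(r[v] is not None for v in variables)]
    if not complete:
        raise ValueError("no row observes all requested variables")
    counts = Counter(tuple(r[v] for v in variables) for r in complete)
    return {values: n / len(complete) for values, n in counts.items()}
```

Only the queried variables need to be observed, so a row with a missing value elsewhere is still used. The estimate needs a single pass over the data and no inference in the network.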
Factored Deletion (MCAR)
Many ways of factorizing the estimand!

  X1   X*2   X*3   R_X2   R_X3   P*
  0    0     0     ob     ob     0.200
  0    0     1     ob     ob     0.100
  0    1     0     ob     ob     0.050
  0    1     1     ob     ob     0.050
  1    0     0     ob     ob     0.060
  1    0     1     ob     ob     0.040
  1    1     0     ob     ob     0.070
  1    1     1     ob     ob     0.030
  0    0     m     ob     unob   0.100
  0    1     m     ob     unob   0.020
  1    0     m     ob     unob   0.080
  1    1     m     ob     unob   0.180
  1    m     m     unob   unob   0.020

Estimands built up along the lattice:
• $P(X_2) = P(X_2 \mid R_{X_2} = \text{ob})$
• $P(X_3) = P(X_3 \mid R_{X_3} = \text{ob})$
• $P(X_1, X_2) = P(X_2 \mid X_1, R_{X_2} = \text{ob}) \, P(X_1)$
• $P(X_1, X_2) = P(X_1 \mid X_2, R_{X_2} = \text{ob}) \, P(X_2 \mid R_{X_2} = \text{ob})$

[Figure: lattice of estimands. Top: $P(X_1, X_2, X_3)$. Middle: $P(X_1, X_2)$, $P(X_2, X_3)$, $P(X_1, X_3)$. Lower: $P(X_1)$, $P(X_2)$, $P(X_3)$. Bottom: $1$. Edges carry the factor that links two levels, e.g. $P(X_2)$, $P(X_2 \mid X_1)$, $P(X_2 \mid X_3)$.]
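One path through the lattice corresponds to one chain-rule factorization, and each factor can be estimated from the rows that observe just the variables in that factor. The sketch below evaluates a single joint probability this way. It is an illustration under assumptions, not the authors' code: the names `conditional` and `factored_deletion` are hypothetical, rows are dicts with `None` for missing entries, and MCAR is assumed to hold.

```python
def conditional(rows, target, given, assignment):
    """Estimate Pr(target | given) at `assignment` by deletion.

    Only rows that observe `target` and every variable in `given` are kept.
    """
    used = [target] + list(given)
    observed = [r for r in rows if all(r[v] is not None for v in used)]
    matching = [r for r in observed if all(r[v] == assignment[v] for v in given)]
    if not matching:
        raise ValueError("no observed row matches the conditioning values")
    hits = sum(1 for r in matching if r[target] == assignment[target])
    return hits / len(matching)


def factored_deletion(rows, order, assignment):
    """Estimate Pr(assignment) as a product of deletion-based conditionals.

    `order` fixes the factorization:
    Pr(Y1, ..., Yk) = Pr(Y1 | Y2..Yk) * Pr(Y2 | Y3..Yk) * ... * Pr(Yk).
    """
    prob = 1.0
    for i, target in enumerate(order):
        prob *= conditional(rows, target, order[i + 1:], assignment)
    return prob
```

With `order = ["X2", "X1"]` the product is $P(X_2 \mid X_1, R_{X_2} = \text{ob}) \, P(X_1)$, where the second factor uses every row because $X_1$ is fully observed. With `order = ["X1", "X2"]` it is $P(X_1 \mid X_2, R_{X_2} = \text{ob}) \, P(X_2 \mid R_{X_2} = \text{ob})$. On finite data the two orders can give different estimates, which is why the choice of factorization matters.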