Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data
Guy Van den Broeck, Karthika Mohan, Arthur Choi, Adnan Darwiche, and Judea Pearl
UCLA
UAI 2015
Learning from Incomplete Data
Input: data and BN structure
[Table: example dataset over X1 (Gender), X2 (Experience), X3 (Qualification), X4 (Income); missing entries are marked ?]
[Figure: Bayesian network structure over X1 (Gender), X2 (Experience), X3 (Qualification), X4 (Income)]
[this paper]
X1, RX1, X1*:
\[ X_1^* = \begin{cases} X_1 & \text{if } R_{X_1} = \text{ob} \\ m & \text{if } R_{X_1} = \text{unob} \end{cases} \]
[Figure: network over X1 (Gender), X2 (Experience), X3 (Qualification), X4 (Income) with missingness indicators RX2, RX3, RX4 and proxy X2*]
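The proxy rule above can be illustrated with a short sketch. The function name `to_proxy`, the use of `None` to encode a missing cell, and the sample column are assumptions made for illustration. Only the rule itself (X* equals X when R = ob, and m when R = unob) comes from the slide.

```python
from typing import Optional

MISSING = "m"  # value taken by the proxy X* when the cell is unobserved

def to_proxy(value: Optional[int]):
    """Return (R, X*) for one cell: R is 'ob' or 'unob', X* is the value or 'm'."""
    if value is None:
        return ("unob", MISSING)
    return ("ob", value)

# Hypothetical column for X1, with None marking a missing entry.
x1_column = [1, None, 0, 1, None]

# Split the (R, X*) pairs into the indicator column and the proxy column.
r_x1, x1_star = zip(*(to_proxy(v) for v in x1_column))
print(r_x1)
print(x1_star)
```

The learner only ever sees the pair (R, X*), never X itself, which is why the later identities are written in terms of events such as R = ob.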
[Table: data distribution P* over X1, X2*, X3*, RX2, RX3; m marks a missing value and unob the corresponding indicator value]
\[ P(X_2) = P(X_2 \mid R_{X_2} = \text{ob}) = P(X_2^* \mid R_{X_2} = \text{ob}) \]
[Figure: lattice of marginals 1, P(X1), P(X2), P(X3), P(X1, X2), P(X1, X3), P(X2, X3), P(X1, X2, X3), with edges labeled P(X2 | X1) and P(X2 | X3), shown next to the data distribution table P*]
\[ P(X_2) = P(X_2 \mid R_{X_2} = \text{ob}) \]
\[ P(X_3) = P(X_3 \mid R_{X_3} = \text{ob}) \]
\[ P(X_1, X_2) = P(X_2 \mid X_1, R_{X_2} = \text{ob}) \, P(X_1) \]
\[ P(X_1, X_2) = P(X_1 \mid X_2, R_{X_2} = \text{ob}) \, P(X_2 \mid R_{X_2} = \text{ob}) \]
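The deletion identities for P(X2) and P(X1, X2) can be sketched on a toy dataset. This is a minimal illustration, not the authors' implementation: the dataset, the helper `marginal`, and the `None` encoding are hypothetical, and the sketch assumes that whether X2 is missing is independent of the variables themselves, so that conditioning on R_X2 = ob amounts to deleting the rows where X2 is missing.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy dataset: rows are (X1, X2, X3); None marks a missing value.
# X1 is fully observed; X2 and X3 are partially observed.
data = [
    (0, 0, 1), (0, 1, None), (1, 1, 1), (1, None, 0),
    (0, 0, None), (1, 1, 0), (1, None, None), (0, 1, 1),
]

def marginal(rows, idx):
    """Empirical distribution of the variables at idx, using only rows where they are all observed."""
    kept = [tuple(r[i] for i in idx) for r in rows
            if all(r[i] is not None for i in idx)]
    counts = Counter(kept)
    return {k: Fraction(c, len(kept)) for k, c in counts.items()}

# P(X2) = P(X2 | R_X2 = ob): delete the rows where X2 is missing.
p_x2 = marginal(data, [1])

# P(X1, X2) = P(X2 | X1, R_X2 = ob) * P(X1):
# P(X1) uses every row, the conditional uses only rows with X2 observed.
p_x1 = marginal(data, [0])
p_x1x2_ob = marginal(data, [0, 1])            # P(X1, X2 | R_X2 = ob)
p_x1_ob = {}
for (x1, _), p in p_x1x2_ob.items():
    p_x1_ob[x1] = p_x1_ob.get(x1, Fraction(0)) + p   # P(X1 | R_X2 = ob)
p_x1x2 = {(x1, x2): p / p_x1_ob[x1] * p_x1[(x1,)]
          for (x1, x2), p in p_x1x2_ob.items()}

print(p_x2)
print(p_x1x2)
```

The second factorization uses all eight rows for P(X1) rather than only the six rows where X2 is observed, which is the reason for walking the lattice instead of deleting incomplete rows once and for all.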
(Alarm network) Small loss of statistical power
(Alarm network) Huge gain in computational power
INCONSISTENT
[Figure: general m-graph depicting MAR and problem-specific m-graph, each over X1, X2, X3, X4 with missingness indicators RX2 and RX4]
Direct Deletion
\[ P(X_1, X_2) = \sum_{X_3} P(X_2 \mid X_1, X_3, R_{X_2} = \text{ob}) \, P(X_1, X_3) \]
Informed Deletion
\[ P(X_1, X_2) = P(X_2 \mid X_1, R_{X_2} = \text{ob}) \, P(X_1) \]
General m-graph: \( R_{X_2} \not\perp X_3 \mid X_1 \). Problem-specific m-graph: \( R_{X_2} \perp X_3 \mid X_1 \).
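The two estimators of P(X1, X2) can be compared in a small sketch. Everything here is an illustrative assumption rather than the authors' code: the toy rows, the helper `freq`, and the function names. X1 and X3 are taken to be fully observed and X2 partially observed. The informed variant additionally assumes that the missingness of X2 is independent of X3 given X1, which is what allows X3 to be dropped from the conditioning set.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy data: rows are (X1, X2, X3); X2 is None when missing.
data = [
    (0, 0, 0), (0, 1, 0), (0, None, 1), (0, 1, 1),
    (1, 1, 0), (1, None, 0), (1, 0, 1), (1, 1, 1),
]

def freq(rows, key):
    """Empirical distribution of key(row) over the given rows."""
    counts = Counter(key(r) for r in rows)
    return {k: Fraction(c, len(rows)) for k, c in counts.items()}

observed = [r for r in data if r[1] is not None]   # rows with R_X2 = ob

def direct_deletion(x1, x2):
    """Sum over X3 of P(X2 | X1, X3, R_X2 = ob) * P(X1, X3)."""
    p_x1x3 = freq(data, lambda r: (r[0], r[2]))
    total = Fraction(0)
    for (a, x3), p in p_x1x3.items():
        if a != x1:
            continue
        cell = [r for r in observed if r[0] == x1 and r[2] == x3]
        if cell:  # a stratum with no observed X2 contributes nothing in this sketch
            total += Fraction(sum(r[1] == x2 for r in cell), len(cell)) * p
    return total

def informed_deletion(x1, x2):
    """P(X2 | X1, R_X2 = ob) * P(X1), valid only under the extra independence assumption."""
    p_x1 = freq(data, lambda r: r[0])[x1]
    cell = [r for r in observed if r[0] == x1]
    return Fraction(sum(r[1] == x2 for r in cell), len(cell)) * p_x1

print(direct_deletion(0, 1), informed_deletion(0, 1))
```

On a finite sample the two estimates generally differ, as they do here. The informed version pools the rows across values of X3, so each conditional is estimated from more data and no sum over X3 is needed.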
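[Figure: m-graph over X, Y, R_X, R_Y]
\[ P(X, Y) = \frac{P(R_X = \text{ob}, R_Y = \text{ob}, X, Y)}{P(R_X = \text{ob} \mid Y, R_Y = \text{ob}) \, P(R_Y = \text{ob} \mid X, R_X = \text{ob})} \]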
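The last identity can be checked numerically with exact arithmetic. The sketch below is an assumption-laden illustration: the joint distribution `p_xy` and the two missingness tables are made-up numbers, and the mechanism is assumed to be one where the missingness of X depends only on Y and the missingness of Y depends only on X. Under that assumption, every quantity on the right-hand side involves a variable only together with its own indicator being ob, so it can be estimated from the observed data.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical ground truth, used only to check the identity.
p_xy = {(0, 0): F(3, 10), (0, 1): F(2, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)}
p_rx_ob_given_y = {0: F(9, 10), 1: F(1, 2)}   # assumed P(R_X = ob | Y)
p_ry_ob_given_x = {0: F(4, 5), 1: F(3, 5)}    # assumed P(R_Y = ob | X)

def joint(x, y, rx_ob, ry_ob):
    """P(X, Y, R_X, R_Y) under the assumed missingness mechanism."""
    prx = p_rx_ob_given_y[y] if rx_ob else 1 - p_rx_ob_given_y[y]
    pry = p_ry_ob_given_x[x] if ry_ob else 1 - p_ry_ob_given_x[x]
    return p_xy[(x, y)] * prx * pry

def recover(x, y):
    """Evaluate the right-hand side of the identity from the full joint."""
    # P(R_X = ob | Y = y, R_Y = ob)
    num_rx = sum(joint(a, y, True, True) for a in (0, 1))
    den_rx = sum(joint(a, y, r, True) for a, r in product((0, 1), (True, False)))
    # P(R_Y = ob | X = x, R_X = ob)
    num_ry = sum(joint(x, b, True, True) for b in (0, 1))
    den_ry = sum(joint(x, b, True, r) for b, r in product((0, 1), (True, False)))
    return joint(x, y, True, True) / ((num_rx / den_rx) * (num_ry / den_ry))

recovered = {xy: recover(*xy) for xy in p_xy}
print(recovered == p_xy)
```

Using `Fraction` avoids floating-point tolerance in the comparison, so the check is an exact equality rather than an approximate one.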