Structure Learning Algorithm

Algorithm FBC-Structure(S, X)
1  B = empty.
2  Partition the training data S into |C| subsets S_c by the class value c.
3  For each training data set S_c:
   - Compute the mutual information M(X_i; X_j) and the dependency threshold φ(X_i, X_j) between each pair of variables X_i and X_j.
   - Compute W(X_i) for each variable X_i.
   - For all variables X_i in X:
     - Add all the variables X_j with W(X_j) > W(X_i) to the parent set Π_X_i of X_i.
     - Add arcs from all the variables X_j in Π_X_i to X_i.
   - Add the resulting network B_c to B.
4  Return B.
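The per-class loop above can be sketched in Python. This is a minimal sketch, not the authors' implementation: the helper names are mine, and both the base-10 logarithm and the reading of T_ij as the number of joint states of the pair are assumptions chosen to reproduce the worked example that follows.

```python
import math
from collections import Counter

def mutual_info(data, i, j):
    """Empirical mutual information M(Xi; Xj); base-10 logs (assumed,
    consistent with the numbers in the worked example)."""
    n = len(data)
    pij = Counter((row[i], row[j]) for row in data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    return sum((c / n) * math.log10((c / n) / ((pi[a] / n) * (pj[b] / n)))
               for (a, b), c in pij.items())

def threshold(data, i, j):
    """Dependency threshold phi(Xi, Xj) = Tij * log N / (2N).
    Tij is taken to be the number of joint states of (Xi, Xj) -- an
    assumption; it gives Tij = 4 for two binary variables, as in the example."""
    n = len(data)
    tij = len({row[i] for row in data}) * len({row[j] for row in data})
    return tij * math.log10(n) / (2 * n)

def fbc_structure(data_by_class, variables):
    """One parent-set map per class value: Xj is a parent of Xi iff W(Xj) > W(Xi)."""
    nets = {}
    for c, data in data_by_class.items():
        # Total influence W(Xi): sum the pairwise MIs that pass the threshold.
        W = {i: sum(mutual_info(data, i, j)
                    for j in variables
                    if j != i and mutual_info(data, i, j) > threshold(data, i, j))
             for i in variables}
        nets[c] = {i: [j for j in variables if W[j] > W[i]] for i in variables}
    return nets
```

Each B_c is then the complete DAG in which arcs run from higher-W to lower-W variables.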
Example - Structure Learning Algorithm

Example using 1000 labeled instances, where C is the class variable and A, B, and D are feature variables.

C    A    B    D    #      |  C    A    B    D    #
c1   a1   b1   d1   11     |  c2   a1   b1   d1   36
c1   a1   b1   d2   5      |  c2   a1   b1   d2   36
c1   a1   b2   d1   7      |  c2   a1   b2   d1   259
c1   a1   b2   d2   17     |  c2   a1   b2   d2   29
c1   a2   b1   d1   227    |  c2   a2   b1   d1   96
c1   a2   b1   d2   97     |  c2   a2   b1   d2   96
c1   a2   b2   d1   11     |  c2   a2   b2   d1   43
c1   a2   b2   d2   25     |  c2   a2   b2   d2   5
Example - Structure Learning Algorithm

The 400 data instances where C = c1:

#     C    A    B    D
11    c1   a1   b1   d1
5     c1   a1   b1   d2
7     c1   a1   b2   d1
17    c1   a1   b2   d2
227   c1   a2   b1   d1
97    c1   a2   b1   d2
11    c1   a2   b2   d1
25    c1   a2   b2   d2

P(A, B):
       b1              b2
a1     (11+5)/400      (7+17)/400
a2     (227+97)/400    (11+25)/400
Example - Structure Learning Algorithm

P(A, B):
       b1      b2
a1     0.04    0.06
a2     0.81    0.09
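The marginalization from the count table to P(A, B) can be reproduced directly in plain Python; the count dictionary below just transcribes the c1 table.

```python
# Counts for the C = c1 subset, keyed by (A, B, D); same order as the table.
counts = {('a1','b1','d1'): 11,  ('a1','b1','d2'): 5,
          ('a1','b2','d1'): 7,   ('a1','b2','d2'): 17,
          ('a2','b1','d1'): 227, ('a2','b1','d2'): 97,
          ('a2','b2','d1'): 11,  ('a2','b2','d2'): 25}
n = sum(counts.values())  # 400 instances with C = c1

# Marginalize D away: P(a, b) = sum_d N(a, b, d) / N
p_ab = {}
for (a, b, d), c in counts.items():
    p_ab[(a, b)] = p_ab.get((a, b), 0) + c / n

print({k: round(v, 2) for k, v in p_ab.items()})
```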
Example - Structure Learning Algorithm

P(A, B):
       b1      b2
a1     0.04    0.06
a2     0.81    0.09

P(A)P(B):
       b1                         b2
a1     (0.04+0.06)·(0.04+0.81)    (0.04+0.06)·(0.06+0.09)
a2     (0.81+0.09)·(0.04+0.81)    (0.81+0.09)·(0.06+0.09)

M(X; Y) = Σ_{x∈X, y∈Y} P(x, y) · log( P(x, y) / (P(x) P(y)) )
Example - Structure Learning Algorithm

P(A, B):
       b1      b2
a1     0.04    0.06
a2     0.81    0.09

P(A)P(B):
       b1       b2
a1     0.085    0.015
a2     0.765    0.135

M(X; Y) = Σ_{x∈X, y∈Y} P(x, y) · log( P(x, y) / (P(x) P(y)) )

M(A; B) = 0.04 · log(0.04/0.085) + 0.81 · log(0.81/0.765)
        + 0.06 · log(0.06/0.015) + 0.09 · log(0.09/0.135)
        = 0.027
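The mutual-information arithmetic can be checked cell by cell. Base-10 logarithms are an inference here (the base is not stated on the slide), but they reproduce the value 0.027.

```python
import math

# Cells of P(A, B) and P(A)P(B), in the same order: a1b1, a2b1, a1b2, a2b2.
p_joint   = [0.04, 0.81, 0.06, 0.09]
p_product = [0.085, 0.765, 0.015, 0.135]

# M(A; B) = sum over cells of P(a, b) * log10( P(a, b) / (P(a) P(b)) )
m_ab = sum(p * math.log10(p / q) for p, q in zip(p_joint, p_product))
print(round(m_ab, 3))  # 0.027
```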
Example - Structure Learning Algorithm

Mutual information:
  M(A; B) = 0.027
  M(A; D) = 0.004
  M(B; D) = 0.018

Dependency threshold:
  φ(X_i, X_j) = (T_ij · log N) / (2N)

  φ(A, B) = φ(A, D) = φ(B, D) = (4 · log 400) / 800 = 0.013

Total influence:
  W(X_i) = Σ_{j ≠ i, M(X_i; X_j) > φ(X_i, X_j)} M(X_i; X_j)

  W(A) = M(A; B) = 0.027
  W(B) = M(A; B) + M(B; D) = 0.045
  W(D) = M(B; D) = 0.018
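These threshold and total-influence numbers follow mechanically from the rounded M values above; a small check, again assuming base-10 logs and T_ij = 4 joint states for a pair of binary variables:

```python
import math

N = 400  # instances with C = c1
T = 4    # joint states of two binary variables (assumed reading of Tij)
phi = T * math.log10(N) / (2 * N)

# Rounded mutual-information values from the slide.
M = {('A', 'B'): 0.027, ('A', 'D'): 0.004, ('B', 'D'): 0.018}

def w(x):
    """Total influence: sum the M values involving x that exceed the threshold."""
    return sum(m for pair, m in M.items() if x in pair and m > phi)

# M(A; D) = 0.004 falls below phi = 0.013, so it drops out of W(A) and W(D).
print(round(phi, 3), round(w('A'), 3), round(w('B'), 3), round(w('D'), 3))
```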
Example - Structure Learning Algorithm

We now construct a full Bayesian network with the variable order given by the total influence values:

  W(A) = 0.027   W(B) = 0.045   W(D) = 0.018

  W(B) > W(A) > W(D)

[Network diagram B_c1: B → A, B → D, A → D]

We now have the full Bayesian network B_c1, which is the part of the multinet that corresponds to C = c1. We should now repeat the process to construct B_c2 and thereby complete the FBC structure learning.
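The arc set of B_c1 can be read off mechanically from the ordering: every variable with a strictly larger W becomes a parent. A tiny sketch, with the variable names from the example:

```python
# Parents follow directly from W(B) > W(A) > W(D).
W = {'A': 0.027, 'B': 0.045, 'D': 0.018}
parents = {x: sorted(y for y in W if W[y] > W[x]) for x in W}
print(parents)  # {'A': ['B'], 'B': [], 'D': ['A', 'B']}
```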
CPT-tree Learning

We now need to learn a CPT-tree for each variable in the full BN.

A traditional decision tree learning algorithm, such as C4.5, can be used to learn CPT-trees. However, since the time complexity of such an algorithm is typically O(n² · N), the resulting FBC learning algorithm would have a complexity of O(n³ · N).

Instead, a fast decision tree learning algorithm is proposed. The algorithm uses the mutual information to determine a fixed ordering of variables from root to leaves. This predefined variable ordering makes the algorithm faster than traditional decision tree learning algorithms.
CPT-tree Learning Algorithm

Algorithm Fast-CPT-Tree(Π_X_i, S)
1  Create an empty tree T.
2  If (S is pure or empty) or (Π_X_i is empty)
     Return T.
3  qualified = False.
4  While (qualified == False) and (Π_X_i is not empty)
     Choose the variable X_j with the highest M(X_j; X_i).
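The pseudocode above ends mid-step in this excerpt. A hedged recursive sketch of how such a fixed-ordering tree learner could look is given below; the qualification test and the recursion are my assumptions, not the authors' definition, since the excerpt does not show them.

```python
def fast_cpt_tree(parents, data, target, mi):
    """Sketch of a Fast-CPT-Tree-style learner.

    parents : list of candidate parent variables (column indices)
    data    : list of rows (tuples)
    target  : index of the variable Xi whose CPT-tree is being built
    mi      : function mi(data, j, i) -> mutual information on this subset
    """
    # Stop if the data is pure or empty, or no parents remain.
    if len({row[target] for row in data}) <= 1 or not parents:
        return {'leaf': data}

    # Fixed ordering: split on the remaining parent with the highest M(Xj; Xi).
    # A parent "qualifies" here if its MI is positive -- an assumed stand-in
    # for the qualification test that the excerpt does not show.
    ranked = sorted(parents, key=lambda j: mi(data, j, target), reverse=True)
    split = next((j for j in ranked if mi(data, j, target) > 0), None)
    if split is None:
        return {'leaf': data}

    # Recurse into each value of the chosen parent with the parents that remain.
    rest = [j for j in parents if j != split]
    branches = {v: fast_cpt_tree(rest, [r for r in data if r[split] == v],
                                 target, mi)
                for v in {row[split] for row in data}}
    return {'split': split, 'branches': branches}
```

Because the variable ordering is driven by mutual information rather than re-scored per split as in C4.5, each path of the tree is built in a single pass over the candidate parents.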