Belief network inference



Belief network inference

Four main approaches to determine posterior distributions in belief networks:

- Variable elimination: exploit the structure of the network to eliminate (sum out) the non-observed, non-query variables one at a time.
- Search-based approaches: enumerate some of the possible worlds, and estimate posterior probabilities from the worlds generated.
- Stochastic simulation: random cases are generated according to the probability distributions.
- Variational methods: find the closest tractable distribution to the (posterior) distribution we are interested in.

© D. Poole and A. Mackworth 2010, Artificial Intelligence, Lecture 6.4

Factors

A factor is a representation of a function from a tuple of random variables into a number. We write a factor f on variables X1, ..., Xj as f(X1, ..., Xj).

We can assign some or all of the variables of a factor:
- f(X1 = v1, X2, ..., Xj), where v1 ∈ dom(X1), is a factor on X2, ..., Xj.
- f(X1 = v1, X2 = v2, ..., Xj = vj) is a number: the value of f when each Xi has value vi.

The former is also written as f(X1, X2, ..., Xj)_{X1 = v1}, etc.

Example factors

r(X, Y, Z):
    X  Y  Z   val
    t  t  t   0.1
    t  t  f   0.9
    t  f  t   0.2
    t  f  f   0.8
    f  t  t   0.4
    f  t  f   0.6
    f  f  t   0.3
    f  f  f   0.7

Assigning X = t selects the rows with X = t and drops the X column:

r(X=t, Y, Z):
    Y  Z   val
    t  t   0.1
    t  f   0.9
    f  t   0.2
    f  f   0.8

r(X=t, Y, Z=f):
    Y   val
    t   0.9
    f   0.8

r(X=t, Y=f, Z=f) = 0.8
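As a concrete sketch, assigning a value to a variable of a factor can be implemented by selecting and projecting table rows. The (vars, table) pair representation below is an assumption for illustration, not something the slides prescribe:

```python
def assign(factor, var, value):
    """f(..., var=value, ...): keep rows where var has the value, drop var's column."""
    vars_, table = factor
    i = vars_.index(var)
    new_vars = vars_[:i] + vars_[i+1:]
    new_table = {k[:i] + k[i+1:]: p for k, p in table.items() if k[i] == value}
    return (new_vars, new_table)

# r(X, Y, Z) from the example ('t'/'f' for true/false):
r = (("X", "Y", "Z"), {
    ("t","t","t"): 0.1, ("t","t","f"): 0.9, ("t","f","t"): 0.2, ("t","f","f"): 0.8,
    ("f","t","t"): 0.4, ("f","t","f"): 0.6, ("f","f","t"): 0.3, ("f","f","f"): 0.7,
})

r_xt = assign(r, "X", "t")        # r(X=t, Y, Z): a factor on (Y, Z)
r_xt_zf = assign(r_xt, "Z", "f")  # r(X=t, Y, Z=f): a factor on (Y,)
print(assign(r_xt_zf, "Y", "f"))  # ((), {(): 0.8}), i.e. r(X=t, Y=f, Z=f) = 0.8
```

Each assignment halves the table here because every variable is Boolean, matching the shrinking tables on the slide.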

Multiplying factors

The product of factor f1(X, Y) and f2(Y, Z), where Y are the variables in common, is the factor (f1 × f2)(X, Y, Z) defined by:

    (f1 × f2)(X, Y, Z) = f1(X, Y) × f2(Y, Z).

Multiplying factors example

f1(A, B):
    A  B   val
    t  t   0.1
    t  f   0.9
    f  t   0.2
    f  f   0.8

f2(B, C):
    B  C   val
    t  t   0.3
    t  f   0.7
    f  t   0.6
    f  f   0.4

f1 × f2 (A, B, C):
    A  B  C   val
    t  t  t   0.03
    t  t  f   0.07
    t  f  t   0.54
    t  f  f   0.36
    f  t  t   0.06
    f  t  f   0.14
    f  f  t   0.48
    f  f  f   0.32
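The product can be sketched as a nested-loop join over the two tables: rows that agree on the shared variables are combined and their values multiplied. The (vars, table) factor representation is an assumption for illustration:

```python
def multiply(f1, f2):
    """(f1 × f2): join rows that agree on the shared variables, multiplying values."""
    (vars1, t1), (vars2, t2) = f1, f2
    shared = [v for v in vars1 if v in vars2]
    extra = tuple(v for v in vars2 if v not in vars1)
    out = {}
    for k1, p1 in t1.items():
        a1 = dict(zip(vars1, k1))
        for k2, p2 in t2.items():
            a2 = dict(zip(vars2, k2))
            if all(a1[v] == a2[v] for v in shared):
                out[k1 + tuple(a2[v] for v in extra)] = p1 * p2
    return (vars1 + extra, out)

f1 = (("A", "B"), {("t","t"): 0.1, ("t","f"): 0.9, ("f","t"): 0.2, ("f","f"): 0.8})
f2 = (("B", "C"), {("t","t"): 0.3, ("t","f"): 0.7, ("f","t"): 0.6, ("f","f"): 0.4})

prod = multiply(f1, f2)        # a factor on (A, B, C) with 8 rows
print(prod[1][("t","f","t")])  # 0.9 * 0.6, i.e. the 0.54 row of the table
```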

Summing out variables

We can sum out a variable, say X1 with domain {v1, ..., vk}, from factor f(X1, ..., Xj), resulting in a factor on X2, ..., Xj defined by:

    (Σ_{X1} f)(X2, ..., Xj) = f(X1 = v1, ..., Xj) + ··· + f(X1 = vk, ..., Xj)

Summing out a variable example

f3(A, B, C):
    A  B  C   val
    t  t  t   0.03
    t  t  f   0.07
    t  f  t   0.54
    t  f  f   0.36
    f  t  t   0.06
    f  t  f   0.14
    f  f  t   0.48
    f  f  f   0.32

Σ_B f3 (A, C):
    A  C   val
    t  t   0.57
    t  f   0.43
    f  t   0.54
    f  f   0.46
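Summing out can be sketched as dropping the variable's column and adding together the rows that then coincide. The (vars, table) factor representation is an assumption for illustration:

```python
def sum_out(factor, var):
    """Σ_var f: drop var's column and add together rows that then coincide."""
    vars_, table = factor
    i = vars_.index(var)
    out = {}
    for k, p in table.items():
        key = k[:i] + k[i+1:]
        out[key] = out.get(key, 0.0) + p
    return (vars_[:i] + vars_[i+1:], out)

# f3(A, B, C), the product table from the previous example:
f3 = (("A", "B", "C"), {
    ("t","t","t"): 0.03, ("t","t","f"): 0.07, ("t","f","t"): 0.54, ("t","f","f"): 0.36,
    ("f","t","t"): 0.06, ("f","t","f"): 0.14, ("f","f","t"): 0.48, ("f","f","f"): 0.32,
})

g = sum_out(f3, "B")    # Σ_B f3, a factor on (A, C)
print(g[1][("t","t")])  # 0.03 + 0.54, i.e. the 0.57 entry of the table
```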

Exercise

Given factors:

s(A):
    A   val
    t   0.75
    f   0.25

t(A, B):
    A  B   val
    t  t   0.6
    t  f   0.4
    f  t   0.2
    f  f   0.8

o(A):
    A   val
    t   0.3
    f   0.1

What is:
(a) s × t
(b) Σ_A s × t
(c) Σ_B s × t
(d) s × t × o
(e) Σ_A s × t × o
(f) Σ_B s × t × o

Evidence

If we want to compute the posterior probability of Z given evidence Y1 = v1 ∧ ... ∧ Yj = vj:

    P(Z | Y1 = v1, ..., Yj = vj)
        = P(Z, Y1 = v1, ..., Yj = vj) / P(Y1 = v1, ..., Yj = vj)
        = P(Z, Y1 = v1, ..., Yj = vj) / Σ_Z P(Z, Y1 = v1, ..., Yj = vj).

So the computation reduces to computing the joint P(Z, Y1 = v1, ..., Yj = vj). We normalize at the end.
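The final normalization step can be sketched as dividing each entry of the unnormalized factor by the total. The (vars, table) factor representation and the numbers are made up for illustration:

```python
def normalize(factor):
    """Divide each entry by the total so the table sums to 1."""
    vars_, table = factor
    total = sum(table.values())
    return (vars_, {k: p / total for k, p in table.items()})

# An unnormalized f(Z) = P(Z, Y1=v1, ..., Yj=vj) for a Boolean Z (made-up numbers):
f_z = (("Z",), {("t",): 0.5, ("f",): 1.5})
print(normalize(f_z)[1])  # {('t',): 0.25, ('f',): 0.75}
```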

Probability of a conjunction

Suppose the variables of the belief network are X1, ..., Xn. To compute P(Z, Y1 = v1, ..., Yj = vj), we sum out the other variables, {Z1, ..., Zk} = {X1, ..., Xn} − {Z} − {Y1, ..., Yj}. We order the Zi into an elimination ordering.

    P(Z, Y1 = v1, ..., Yj = vj)
        = Σ_{Zk} ··· Σ_{Z1} P(X1, ..., Xn)_{Y1 = v1, ..., Yj = vj}
        = Σ_{Zk} ··· Σ_{Z1} Π_{i=1}^{n} P(Xi | parents(Xi))_{Y1 = v1, ..., Yj = vj}.

Computing sums of products

Computation in belief networks reduces to computing sums of products.

How can we compute ab + ac efficiently? Distribute out the a, giving a(b + c).

How can we compute Σ_{Z1} Π_{i=1}^{n} P(Xi | parents(Xi)) efficiently? Distribute out of the sum those factors that don't involve Z1.
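A quick numeric check of this distribution step, with factors as (vars, table) pairs (an assumed representation): summing Z1 out of a product where f does not involve Z1 gives the same factor as multiplying f by the already-summed-out factor, the "a(b + c)" form.

```python
def multiply(f1, f2):
    """Pointwise product: join rows that agree on shared variables."""
    (vars1, t1), (vars2, t2) = f1, f2
    shared = [v for v in vars1 if v in vars2]
    extra = tuple(v for v in vars2 if v not in vars1)
    out = {}
    for k1, p1 in t1.items():
        a1 = dict(zip(vars1, k1))
        for k2, p2 in t2.items():
            a2 = dict(zip(vars2, k2))
            if all(a1[v] == a2[v] for v in shared):
                out[k1 + tuple(a2[v] for v in extra)] = p1 * p2
    return (vars1 + extra, out)

def sum_out(factor, var):
    """Σ_var f: drop var's column and add rows that then coincide."""
    vars_, table = factor
    i = vars_.index(var)
    out = {}
    for k, p in table.items():
        key = k[:i] + k[i+1:]
        out[key] = out.get(key, 0.0) + p
    return (vars_[:i] + vars_[i+1:], out)

# f does not involve Z1; g does (made-up numbers):
f = (("A",), {("t",): 0.3, ("f",): 0.7})
g = (("A", "Z1"), {("t","t"): 0.5, ("t","f"): 0.5, ("f","t"): 0.9, ("f","f"): 0.1})

lhs = sum_out(multiply(f, g), "Z1")  # Σ_{Z1} (f × g): the "ab + ac" way
rhs = multiply(f, sum_out(g, "Z1"))  # f × (Σ_{Z1} g): the "a(b + c)" way
print(lhs[1], rhs[1])                # the two tables agree (up to float rounding)
```

The second form does less work: f is multiplied into one row per value of A instead of one row per value of (A, Z1).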

Variable elimination algorithm

To compute P(Z | Y1 = v1 ∧ ... ∧ Yj = vj):
- Construct a factor for each conditional probability.
- Set the observed variables to their observed values.
- Sum out each of the other variables (the {Z1, ..., Zk}) according to some elimination ordering.
- Multiply the remaining factors.
- Normalize by dividing the resulting factor f(Z) by Σ_Z f(Z).
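The steps above can be sketched end to end. The (vars, table) factor representation and the two-node network A → B are assumptions for illustration, not from the slides:

```python
from functools import reduce

def assign(factor, var, value):
    """Set var = value: keep matching rows, drop var's column."""
    vars_, table = factor
    i = vars_.index(var)
    return (vars_[:i] + vars_[i+1:],
            {k[:i] + k[i+1:]: p for k, p in table.items() if k[i] == value})

def multiply(f1, f2):
    """Pointwise product: join rows that agree on shared variables."""
    (vars1, t1), (vars2, t2) = f1, f2
    shared = [v for v in vars1 if v in vars2]
    extra = tuple(v for v in vars2 if v not in vars1)
    out = {}
    for k1, p1 in t1.items():
        a1 = dict(zip(vars1, k1))
        for k2, p2 in t2.items():
            a2 = dict(zip(vars2, k2))
            if all(a1[v] == a2[v] for v in shared):
                out[k1 + tuple(a2[v] for v in extra)] = p1 * p2
    return (vars1 + extra, out)

def sum_out(factor, var):
    """Σ_var f: drop var's column and add rows that then coincide."""
    vars_, table = factor
    i = vars_.index(var)
    out = {}
    for k, p in table.items():
        key = k[:i] + k[i+1:]
        out[key] = out.get(key, 0.0) + p
    return (vars_[:i] + vars_[i+1:], out)

def normalize(factor):
    """Divide each entry by the total so the table sums to 1."""
    vars_, table = factor
    total = sum(table.values())
    return (vars_, {k: p / total for k, p in table.items()})

def variable_elimination(factors, query, evidence, ordering):
    # 1. Set the observed variables to their observed values.
    fs = []
    for f in factors:
        for var, val in evidence.items():
            if var in f[0]:
                f = assign(f, var, val)
        fs.append(f)
    # 2. Sum out each non-query variable in the elimination ordering,
    #    multiplying together only the factors that involve it.
    for z in ordering:
        with_z = [f for f in fs if z in f[0]]
        fs = [f for f in fs if z not in f[0]]
        if with_z:
            fs.append(sum_out(reduce(multiply, with_z), z))
    # 3. Multiply the remaining factors and normalize.
    result = normalize(reduce(multiply, fs))
    assert result[0] == (query,)
    return result

# Two-node network A -> B: P(A=t) = 0.4, P(B=t | A=t) = 0.9, P(B=t | A=f) = 0.2.
fA = (("A",), {("t",): 0.4, ("f",): 0.6})
fB = (("A", "B"), {("t","t"): 0.9, ("t","f"): 0.1,
                   ("f","t"): 0.2, ("f","f"): 0.8})

post = variable_elimination([fA, fB], "A", {"B": "t"}, ordering=[])
print(post[1])  # P(A | B=t): roughly {('t',): 0.75, ('f',): 0.25}
```

With the evidence B = t set first, no hidden variables remain here, so the ordering is empty; in a larger network, the choice of elimination ordering determines how big the intermediate factors get.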
