Outline
• Syntax
• Semantics

Chapter 14  Probabilistic Reasoning (Bayesian Networks), Sec. 1-2
20070607 Chap14

Bayesian networks
• A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions.
• Syntax:
  - a set of nodes, one per variable
  - a directed, acyclic graph (link ≈ "directly influences")
  - a conditional distribution for each node given its parents: P(X_i | Parents(X_i))
• In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over X_i for each combination of parent values.

Bayesian networks (cont.)
• Bayesian networks are also called Bayesian Belief Networks, Bayes Nets, Belief Networks, Probabilistic Networks, Graphical Models, etc.

Example
• The topology of the network encodes conditional independence assertions:
  - Weather is independent of the other variables
  - Toothache and Catch are conditionally independent given Cavity

Another Example
• I'm at work; neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by minor earthquakes. Is there a burglar?
• Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
• Network topology reflects "causal" knowledge:
  - A burglar can set the alarm off
  - An earthquake can set the alarm off
  - The alarm can cause Mary to call
  - The alarm can cause John to call
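The node/parents/CPT syntax above can be sketched in Python. This is a minimal illustration, not a library API: the dictionary layout is an assumption, and the CPT values not quoted in these slides (e.g. the rows of P(Alarm | Burglary, Earthquake)) are taken to be the standard burglary-example numbers.

```python
# Sketch of Bayesian-network syntax: one entry per node, listing its
# parents and a CPT that maps each combination of parent values to
# P(node = True).  The P(node = False) entries are implicit (1 - p).
burglary_net = {
    "Burglary":   {"parents": (), "cpt": {(): 0.001}},
    "Earthquake": {"parents": (), "cpt": {(): 0.002}},
    "Alarm":      {"parents": ("Burglary", "Earthquake"),
                   "cpt": {(True, True): 0.95, (True, False): 0.94,
                           (False, True): 0.29, (False, False): 0.001}},
    "JohnCalls":  {"parents": ("Alarm",),
                   "cpt": {(True,): 0.90, (False,): 0.05}},
    "MaryCalls":  {"parents": ("Alarm",),
                   "cpt": {(True,): 0.70, (False,): 0.01}},
}

def p(node, value, assignment):
    """P(node = value), with parent values read from `assignment`."""
    spec = burglary_net[node]
    pt = spec["cpt"][tuple(assignment[q] for q in spec["parents"])]
    return pt if value else 1.0 - pt

print(p("Alarm", True, {"Burglary": False, "Earthquake": False}))  # 0.001
```

Note that the acyclicity of the graph is implicit here: each node's parents must already be defined before it can be queried in topological order.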
Another Example (cont.)
(Network figure: Burglary and Earthquake are the parents of Alarm; Alarm is the sole parent of JohnCalls and MaryCalls.)

Compactness
• A CPT for Boolean X_i with k Boolean parents has 2^k rows, one for each combination of parent values.
• Each row requires one number p for X_i = true (the number for X_i = false is just 1 - p).
• If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers, i.e., it grows linearly with n, vs. O(2^n) for the full joint distribution.
• For the burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 - 1 = 31).

Global Semantics
• Global semantics defines the full joint distribution as the product of the local conditional distributions:
  P(x_1, …, x_n) = ∏_{i=1}^{n} P(x_i | parents(X_i))
  e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)
      = P(j | a) · P(m | a) · P(a | ¬b, ¬e) · P(¬b) · P(¬e)
      = 0.90 · 0.70 · 0.001 · 0.999 · 0.998 ≈ 0.000628

Local Semantics
• Each node is conditionally independent of its nondescendants given its parents.
  e.g., JohnCalls is independent of Burglary and Earthquake, given the value of Alarm.

Markov Blanket
• Each node is conditionally independent of all others given its parents + children + children's parents.
  e.g., Burglary is independent of JohnCalls and MaryCalls, given Alarm and Earthquake.

Constructing Bayesian Networks
1. Choose an ordering of variables X_1, …, X_n
2. For i = 1 to n:
   add X_i to the network
   select parents from X_1, …, X_{i-1} such that P(X_i | Parents(X_i)) = P(X_i | X_1, …, X_{i-1})
• This choice of parents guarantees:
  P(X_1, …, X_n) = ∏_{i=1}^{n} P(X_i | X_1, …, X_{i-1})   (chain rule)
                 = ∏_{i=1}^{n} P(X_i | Parents(X_i))       (by construction)
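The global-semantics product and the compactness count can both be checked by direct arithmetic; the sketch below uses only the numbers quoted on the slides.

```python
# P(j, m, a, ¬b, ¬e) as the product of the local conditional
# probabilities, using the CPT entries quoted on the slide.
p_j_given_a     = 0.90   # P(J = true  | A = true)
p_m_given_a     = 0.70   # P(M = true  | A = true)
p_a_given_nb_ne = 0.001  # P(A = true  | B = false, E = false)
p_not_b         = 0.999  # P(B = false)
p_not_e         = 0.998  # P(E = false)

p = p_j_given_a * p_m_given_a * p_a_given_nb_ne * p_not_b * p_not_e
print(round(p, 6))  # 0.000628

# Compactness: 2^k rows per CPT, one number per row.
rows_per_node = [1, 1, 4, 2, 2]  # B, E, A (2 parents), J, M
print(sum(rows_per_node))        # 10, vs. 2**5 - 1 == 31 for the full joint
```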
Example
• Suppose we choose the ordering M, J, A, B, E
  P(J | M) = P(J)?  No
  P(A | J, M) = P(A | J)?  No
  P(A | J, M) = P(A)?  No
  P(B | A, J, M) = P(B | A)?  Yes
  P(B | A, J, M) = P(B)?  No
  P(E | B, A, J, M) = P(E | A)?  No
  P(E | B, A, J, M) = P(E | A, B)?  Yes

Example (cont.-5)
• Deciding conditional independence is hard in noncausal directions
• (Causal models and conditional independence seem hardwired for humans!)
• Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed
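The first "No" above can be verified numerically: build the full joint distribution from the burglary net's local CPTs and compare P(J) with P(J | M). This is an illustrative sketch; the CPT values not quoted in these slides are assumed to be the standard burglary-example numbers.

```python
from itertools import product

# Burglary network in topological order: (name, parents, CPT), where the
# CPT maps a tuple of parent values to P(node = True).
net = [
    ("B", (), {(): 0.001}),
    ("E", (), {(): 0.002}),
    ("A", ("B", "E"), {(True, True): 0.95, (True, False): 0.94,
                       (False, True): 0.29, (False, False): 0.001}),
    ("J", ("A",), {(True,): 0.90, (False,): 0.05}),
    ("M", ("A",), {(True,): 0.70, (False,): 0.01}),
]

def joint(assign):
    """Global semantics: P(x1..xn) = product of P(xi | parents(Xi))."""
    prob_total = 1.0
    for name, parents, cpt in net:
        pt = cpt[tuple(assign[q] for q in parents)]
        prob_total *= pt if assign[name] else 1.0 - pt
    return prob_total

names = [n for n, _, _ in net]
worlds = [dict(zip(names, vals)) for vals in product([True, False], repeat=5)]

def prob(event, given=lambda w: True):
    """P(event | given), by summing the joint over all 32 worlds."""
    num = sum(joint(w) for w in worlds if event(w) and given(w))
    den = sum(joint(w) for w in worlds if given(w))
    return num / den

p_j = prob(lambda w: w["J"])
p_j_given_m = prob(lambda w: w["J"], given=lambda w: w["M"])
# P(J) differs from P(J | M), so with ordering M, J, ... the node M must
# be kept as a parent of J: P(J | M) = P(J)?  No.
print(p_j, p_j_given_m)
```

The same `prob` helper can check the "Yes" rows, e.g. that P(B | A, J, M) equals P(B | A).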
Example (cont.-6)
• With a bad node ordering such as M, J, E, B, A, we get the network shown in Figure 14.3(b).

Summary
• Bayesian networks provide a natural representation for (causally induced) conditional independence
• Topology + CPTs = compact representation of the joint distribution
• Generally easy for domain experts to construct