Bayes Nets (cont.)

CS 486/686
University of Waterloo
May 31, 2005

CS486/686 Lecture Slides (c) 2005 C. Boutilier, P. Poupart & K. Larson

Outline
• Wrap up d-separation
• Inference in Bayes Nets
• Variable Elimination

D-Separation: Intuitions
[Figure: the example network over Subway, Flu, Malaria, ExoticTrip, Fever, Therm, Aches]

D-Separation: Intuitions
• Subway and Therm are dependent; but are independent given Flu (since Flu blocks the only path).
• Aches and Fever are dependent; but are independent given Flu (since Flu blocks the only path). Similarly for Aches and Therm (dependent, but independent given Flu).
• Flu and Mal are independent (given no evidence): Fever blocks the path, since it is not in evidence, nor is its descendant Therm. Flu and Mal are dependent given Fever (or given Therm): nothing blocks the path now.
• Subway and ExoticTrip are independent; they are dependent given Therm; they are independent given Therm and Malaria. This is for exactly the same reasons as for Flu/Mal above.

Inference in Bayes Nets
• The independence sanctioned by d-separation (and other methods) allows us to compute prior and posterior probabilities quite effectively.
• We'll look at a couple of simple examples to illustrate. We'll focus on networks without loops. (A loop is a cycle in the underlying undirected graph. Recall the directed graph has no cycles.)

Simple Forward Inference (Chain)
• Computing a prior requires simple forward "propagation" of probabilities:
  P(J) = Σ_{M,ET} P(J|M,ET) P(M,ET)      (marginalization)
  P(J) = Σ_{M,ET} P(J|M) P(M|ET) P(ET)   (chain rule and independence)
  P(J) = Σ_M P(J|M) Σ_ET P(M|ET) P(ET)   (distribution of sum)
• Note: all (final) terms are CPTs in the BN.
• Note: only ancestors of J are considered.
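To make the chain computation concrete, here is a minimal sketch in Python. The chain ET → M → J matches the slides, but every CPT number below is an illustrative assumption, not a value from the lecture.

```python
# Forward chain inference: P(J) = sum_M P(J|M) sum_ET P(M|ET) P(ET).
# All CPT numbers are made up for illustration.

p_et = {True: 0.05, False: 0.95}  # P(ET): prior over ExoticTrip

# P(M|ET), indexed as [et][m]
p_m_given_et = {True:  {True: 0.3,   False: 0.7},
                False: {True: 0.001, False: 0.999}}

# P(J|M), indexed as [m][j]
p_j_given_m = {True:  {True: 0.8,  False: 0.2},
               False: {True: 0.01, False: 0.99}}

# Inner sum: P(M) = sum_ET P(M|ET) P(ET)
p_m = {m: sum(p_m_given_et[et][m] * p_et[et] for et in (True, False))
       for m in (True, False)}

# Outer sum: P(J) = sum_M P(J|M) P(M)
p_j = {j: sum(p_j_given_m[m][j] * p_m[m] for m in (True, False))
       for j in (True, False)}

print(p_j)  # note: only ancestors of J (M, ET) were ever consulted
```

Distributing the sum over ET inward is exactly the "distribution of sum" step on the slide: it keeps the work linear in the length of the chain rather than exponential in the number of ancestors.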
Simple Forward Inference (Chain)
• Same idea applies when we have "upstream" evidence:
  P(J|ET) = Σ_M P(J|M,ET) P(M|ET)
          = Σ_M P(J|M) P(M|ET)
  (J is conditionally independent of ET given M)

Simple Forward Inference (Pooling)
• Same idea applies with multiple parents:
  P(Fev) = Σ_{Flu,M} P(Fev|Flu,M) P(Flu,M)                                  (1)
         = Σ_{Flu,M} P(Fev|Flu,M) P(Flu) P(M)                               (2)
         = Σ_{Flu,M} P(Fev|Flu,M) Σ_TS P(Flu|TS) P(TS) Σ_ET P(M|ET) P(ET)   (3)
• (1) follows by the summing out rule; (2) by independence of Flu and M; (3) by summing out.
  – Note: all terms are CPTs in the Bayes net.

Simple Forward Inference (Pooling)
• Same idea applies with evidence:
  P(Fev|ts,~M) = Σ_Flu P(Fev|Flu,ts,~M) P(Flu|ts,~M)
               = Σ_Flu P(Fev|Flu,~M) P(Flu|ts)

Simple Backward Inference
• When evidence is downstream of the query variable, we must reason "backwards." This requires the use of Bayes rule:
  P(ET|j) = α P(j|ET) P(ET)
          = α Σ_M P(j|M,ET) P(M|ET) P(ET)
          = α Σ_M P(j|M) P(M|ET) P(ET)
• The first step is just Bayes rule.
  – The normalizing constant α is 1/P(j); but we needn't compute it explicitly if we compute P(ET|j) for each value of ET: we just add up the terms P(j|ET) P(ET) over all values of ET (they sum to P(j)).

Backward Inference (Pooling)
• The same ideas apply when several pieces of evidence lie "downstream":
  P(ET|j,fev) = α P(j,fev|ET) P(ET)
              = α Σ_M P(j,fev|M,ET) P(M|ET) P(ET)
              = α Σ_M P(j,fev|M) P(M|ET) P(ET)
              = α Σ_M P(j|M) P(fev|M) P(M|ET) P(ET)
• Same steps as before; but now we compute the probability of both pieces of evidence given the hypothesis ET and combine them. Note: they are independent given M, but not given ET.
  – We must still simplify P(fev|M) down to CPTs (as usual).

Variable Elimination
• The intuitions in the examples above give us a simple inference algorithm for networks without loops: the polytree algorithm.
• Instead we'll look at a more general algorithm that works for arbitrary BNs; the polytree algorithm will be a special case.
• This algorithm, variable elimination, simply applies the summing out rule repeatedly.
  – To keep computation simple, it exploits the independence in the network and the ability to distribute sums inward.
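The normalization trick on the backward-inference slides is easy to see in the same toy model. This sketch (CPT numbers again assumed, matching the previous sketch) forms the unnormalized terms P(j|ET) P(ET) for each value of ET and divides by their total, which equals P(j), so α = 1/P(j) never has to be computed as a separate step.

```python
# Backward inference: P(ET|j) = alpha * sum_M P(j|M) P(M|ET) P(ET).
# Hypothetical CPTs for the chain ET -> M -> J (illustrative numbers only).
p_et = {True: 0.05, False: 0.95}
p_m_given_et = {True:  {True: 0.3,   False: 0.7},
                False: {True: 0.001, False: 0.999}}
p_j_given_m = {True:  {True: 0.8,  False: 0.2},
               False: {True: 0.01, False: 0.99}}

# Unnormalized posterior: P(j=true|ET) P(ET) = sum_M P(j|M) P(M|ET) P(ET)
unnormalized = {
    et: sum(p_j_given_m[m][True] * p_m_given_et[et][m]
            for m in (True, False)) * p_et[et]
    for et in (True, False)
}

# The unnormalized terms sum to P(j); dividing by the total normalizes,
# so alpha = 1/P(j) is applied without ever being computed explicitly.
p_j_true = sum(unnormalized.values())
p_et_given_j = {et: v / p_j_true for et, v in unnormalized.items()}
print(p_et_given_j)
```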
Factors
• A function f(X1, X2, …, Xk) is also called a factor. We can view this as a table of numbers, one for each instantiation of the variables X1, X2, …, Xk.
  – A tabular representation of a factor is exponential in k.
• Each CPT in a Bayes net is a factor:
  – e.g., Pr(C|A,B) is a function of three variables: A, B, C.
• Notation: f(X, Y) denotes a factor over the variables X ∪ Y. (Here X and Y are sets of variables.)

The Product of Two Factors
• Let f(X,Y) and g(Y,Z) be two factors with variables Y in common.
• The product of f and g, denoted h = f × g (or sometimes just h = fg), is defined:
  h(X,Y,Z) = f(X,Y) × g(Y,Z)

  f(A,B)       g(B,C)       h(A,B,C)
  ab    0.9    bc    0.7    abc   0.63    ab~c    0.27
  a~b   0.1    b~c   0.3    a~bc  0.08    a~b~c   0.02
  ~ab   0.4    ~bc   0.8    ~abc  0.28    ~ab~c   0.12
  ~a~b  0.6    ~b~c  0.2    ~a~bc 0.48    ~a~b~c  0.12

Restricting a Factor
• Let f(X,Y) be a factor with variable X (Y is a set).
• We restrict factor f to X=x by setting X to the value x and "deleting" it. Define h = f_{X=x} as:
  h(Y) = f(x,Y)

  f(A,B)       h(B) = f_{A=a}
  ab    0.9    b    0.9
  a~b   0.1    ~b   0.1
  ~ab   0.4
  ~a~b  0.6

Summing a Variable Out of a Factor
• Let f(X,Y) be a factor with variable X (Y is a set).
• We sum out variable X from f to produce a new factor h = Σ_X f, which is defined:
  h(Y) = Σ_{x ∈ Dom(X)} f(x,Y)

  f(A,B)       h(B)
  ab    0.9    b    1.3
  a~b   0.1    ~b   0.7
  ~ab   0.4
  ~a~b  0.6

Variable Elimination: No Evidence
• Computing the prior probability of a query variable X can be seen as applying these operations on factors.

  [Network: A → B → C, with factors f1(A), f2(A,B), f3(B,C)]

  P(C) = Σ_{A,B} P(C|B) P(B|A) P(A)
       = Σ_B P(C|B) Σ_A P(B|A) P(A)
       = Σ_B f3(B,C) Σ_A f2(A,B) f1(A)
       = Σ_B f3(B,C) f4(B)
       = f5(C)

• Define new factors: f4(B) = Σ_A f2(A,B) f1(A) and f5(C) = Σ_B f3(B,C) f4(B).

Variable Elimination: No Evidence
• Here's the example with some numbers:

  f1(A)      f2(A,B)      f3(B,C)      f4(B)      f5(C)
  a   0.9    ab    0.9    bc    0.7    b   0.85   c   0.625
  ~a  0.1    a~b   0.1    b~c   0.3    ~b  0.15   ~c  0.375
             ~ab   0.4    ~bc   0.2
             ~a~b  0.6    ~b~c  0.8
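The three factor operations and the elimination trace above are mechanical enough to check in code. The sketch below reproduces f4(B) and f5(C) from the numbers on the slides; the tuple-keyed dictionary representation of factors is our own choice for illustration, not something the lecture prescribes.

```python
from itertools import product

# A factor is a pair (vars, table): vars is a tuple of variable names, and
# table maps each assignment tuple (True/False per variable) to a number.

def multiply(f, g):
    """Pointwise product h(X,Y,Z) = f(X,Y) * g(Y,Z)."""
    fv, ft = f
    gv, gt = g
    hv = fv + tuple(v for v in gv if v not in fv)  # union of variables
    ht = {}
    for assign in product((True, False), repeat=len(hv)):
        val = dict(zip(hv, assign))
        ht[assign] = (ft[tuple(val[v] for v in fv)] *
                      gt[tuple(val[v] for v in gv)])
    return hv, ht

def restrict(f, var, value):
    """h(Y) = f(x, Y): fix var to value and delete it from the factor."""
    fv, ft = f
    i = fv.index(var)
    hv = fv[:i] + fv[i + 1:]
    ht = {a[:i] + a[i + 1:]: v for a, v in ft.items() if a[i] == value}
    return hv, ht

def sum_out(f, var):
    """h(Y) = sum over x in Dom(var) of f(x, Y)."""
    fv, ft = f
    i = fv.index(var)
    hv = fv[:i] + fv[i + 1:]
    ht = {}
    for a, v in ft.items():
        key = a[:i] + a[i + 1:]
        ht[key] = ht.get(key, 0.0) + v
    return hv, ht

# Factors for the chain A -> B -> C, with the numbers from the slides.
f1 = (("A",), {(True,): 0.9, (False,): 0.1})
f2 = (("A", "B"), {(True, True): 0.9, (True, False): 0.1,
                   (False, True): 0.4, (False, False): 0.6})
f3 = (("B", "C"), {(True, True): 0.7, (True, False): 0.3,
                   (False, True): 0.2, (False, False): 0.8})

print(restrict(f2, "A", True)[1])   # restriction slide: {b: 0.9, ~b: 0.1}

f4 = sum_out(multiply(f2, f1), "A")  # f4(B): {b: 0.85, ~b: 0.15}
f5 = sum_out(multiply(f3, f4), "B")  # f5(C): {c: 0.625, ~c: 0.375}
print(f4[1], f5[1])
```

Eliminating A and then B reproduces the slides' f4 and f5 tables exactly, which is a useful sanity check that the product and summing-out definitions are being applied in the right order.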